Brief IA

AI Agents and Python: Revolutionizing Web Navigation in 2026

🛠️ AI Tools · Tom Levy

Key Takeaways
1. Traditional AI agents often limit themselves to APIs, covering only a fraction of human tasks.
2. Playwright outperforms Selenium in speed and compatibility for browser automation in 2026.
3. Setting up a browser agent in Python requires specific steps, including the installation of Playwright.
💡 Why it matters: The growing use of AI agents in web browsing could transform automation processes in businesses.

Full Analysis

Introduction

In the world of artificial intelligence agents, most tutorials start with the use of an API. These guides show you how to interact with services like OpenWeather or Stripe, or how to extract data from GitHub. This works well until you want to create something more concrete and realize that the task at hand is not covered by an API.

Consider what users do in a browser every day: filling out administrative forms, comparing competitor prices, extracting data from JavaScript-rendered sites, or logging into portals that do not support OAuth. With around 1.1 billion websites online, only a tiny fraction offers public APIs. The majority can only be used through a browser.

An AI agent limited to API calls can only handle about 5% of the tasks that a human accomplishes daily. By equipping this agent with a browser, its capacity for action approaches that of a human user. This article aims to bridge that gap.

The AI agent market is estimated to reach $10.91 billion by 2026 and could hit $50.31 billion by 2030, with a growing share of agents capable of browsing the web. Currently, 27.7% of companies are using browser agents in production, a figure that was nearly zero two years ago. Tools have evolved rapidly, and the models are now mature enough for the technique to be taught effectively.

By the end of this article, you will be able to create a functional browser agent that can navigate real websites, fill out forms, extract structured data, and connect to an LLM to decide on the next actions, all using Python.

Why Choose Playwright Over Selenium

Five years ago, browser automation was primarily done with Selenium. While Selenium is still widely used and reliable, for any new project in 2026, Playwright has become the default choice. The reasons for this shift are practical.

Selenium operates by sending individual HTTP requests to a WebDriver for each action, whether it's a click, input, or scroll. In contrast, Playwright uses a persistent WebSocket connection for the entire session, allowing commands to flow without the latency cost per action. Independent benchmarks show that Playwright is 30 to 50% faster than Selenium, averaging 290 ms per action compared to 536 ms for Selenium. For a browser agent potentially executing hundreds of actions, this performance gap is significant.
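To make the gap concrete, here is a small back-of-the-envelope calculation. It only reuses the two per-action averages quoted above (290 ms and 536 ms); the session sizes (50, 200, and 500 actions) are illustrative assumptions, not measurements.

```python
# Per-action averages quoted in the article, in milliseconds.
PLAYWRIGHT_MS = 290
SELENIUM_MS = 536


def session_overhead(actions: int) -> dict:
    """Estimate total time spent on browser actions for one agent session."""
    playwright_ms = actions * PLAYWRIGHT_MS
    selenium_ms = actions * SELENIUM_MS
    return {
        "playwright_s": playwright_ms / 1000,
        "selenium_s": selenium_ms / 1000,
        "saved_s": (selenium_ms - playwright_ms) / 1000,
        # Relative reduction in time per action, as a percentage.
        "reduction_pct": round(100 * (SELENIUM_MS - PLAYWRIGHT_MS) / SELENIUM_MS, 1),
    }


if __name__ == "__main__":
    # Hypothetical session sizes, chosen only for illustration.
    for n in (50, 200, 500):
        stats = session_overhead(n)
        print(f"{n} actions: saves {stats['saved_s']} s ({stats['reduction_pct']}% less time)")
```

With these figures, a 200-action session spends roughly 49 seconds less on browser round trips, and the per-action reduction of about 46% sits inside the quoted 30 to 50% range.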

Moreover, Playwright manages its own browser binaries. Running playwright install downloads builds of Chromium, Firefox, and WebKit that are matched to your version of Playwright. This eliminates driver version mismatches and CI pipeline failures caused by Chrome updates. Playwright also waits automatically before acting on an element: before a click, it checks that the element is visible, enabled, and stable (not animating), which removes the need for calls like time.sleep(2).

For AI agents, Playwright simulates real mouse and keyboard events, mimicking human interactions with browsers. Sites designed to detect automation look for synthetic DOM clicks, but Playwright's interaction model is harder to distinguish from actual human input.

Setting Up the Environment

To get started, you will need Python 3.10 or a newer version, an OpenAI API key, and a few minutes for setup.

  • Step 1: Create a Virtual Environment

python -m venv browser_agent_env
# macOS / Linux
source browser_agent_env/bin/activate
# Windows
browser_agent_env\Scripts\activate

  • Step 2: Install Dependencies

pip install playwright \
browser-use \
langchain \
langchain-openai \
langgraph \
langchain-community \
python-dotenv

  • Step 3: Install Browser Binaries

This step is often overlooked. The Python package does not bundle the browsers themselves; Playwright downloads them separately. Run this command after installation:

playwright install chromium

If you want to install all three browser engines, use playwright install. However, Chromium alone is usually sufficient for most agent tasks and is lighter to download.

  • Step 4: Store Your API Key

Create a .env file in your project directory:

OPENAI_API_KEY=your_openai_api_key_here

Add .env to your .gitignore file immediately to avoid compromising your API keys.
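A minimal sketch of how a script might read that key at startup, assuming the python-dotenv package from Step 2. The helper name get_openai_key is hypothetical, and the placeholder check simply compares against the sample value shown above.

```python
import os

try:
    # python-dotenv (installed in Step 2) copies .env entries into os.environ.
    from dotenv import load_dotenv
except ImportError:
    # Fall back to the plain process environment if the package is missing.
    load_dotenv = None


def get_openai_key() -> str:
    """Return the OpenAI API key, failing early with a readable message."""
    if load_dotenv is not None:
        # By default this does not override variables that are already set.
        load_dotenv()
    key = os.environ.get("OPENAI_API_KEY", "").strip()
    if not key or key == "your_openai_api_key_here":
        raise RuntimeError(
            "OPENAI_API_KEY is missing or still the placeholder; check your .env file."
        )
    return key
```

Failing at startup is a deliberate choice here: a missing key is much easier to diagnose before the browser launches than as an authentication error in the middle of an agent run.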

  • Step 5: Verify Everything Works

Here’s a basic script that navigates to a URL, reads the page title, and saves a screenshot. Use example.com, a public test domain maintained by IANA, which will not block you.

How to Run: Save the script as first_run.py and execute python first_run.py.

import asyncio

from playwright.async_api import async_playwright


async def main():
    async with async_playwright() as p:
        # Launch headless Chromium (no visible window).
        browser = await p.chromium.launch(headless=True)
        # A context is an isolated session with its own cookies and storage.
        context = await browser.new_context(
            viewport={"width": 1280, "height": 720},
            user_agent=(
                "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
                "AppleWebKit/537.36 (KHTML, like Gecko) "
                "Chrome/120.0.0.0 Safari/537.36"
            )
        )
        page = await context.new_page()
        # Load the page and wait for network activity to settle.
        await page.goto("https://example.com", wait_until="networkidle")
        title = await page.title()
        print(f"Page title: {title}")
        # Read the text of the first <h1> element on the page.
        h1 = await page.text_content("h1")
        print(f"H1 heading: {h1}")
        # Capture the whole scrollable page, not just the viewport.
        await page.screenshot(path="screenshot.png", full_page=True)
        print("Screenshot saved to screenshot.png")
        await browser.close()


if __name__ == "__main__":
    asyncio.run(main())

This script uses async_playwright() as the entry point for the entire Playwright session. The browser context (the context variable) is comparable to opening a new private browsing window: its cookies, local storage, and cache are isolated from other contexts. The parameter wait_until="networkidle" tells Playwright to treat navigation as finished once there have been no network connections for at least 500 ms. That is convenient for a first smoke test, but the Playwright documentation discourages relying on it; for dynamic pages, waiting for a specific element to appear is usually more reliable.
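The isolation between contexts can be checked with a short sketch. It sets a cookie in one context and confirms that a second context in the same browser does not see it. The cookie name and value and the function name cookies_are_isolated are hypothetical; no page is loaded, so no network access is needed, though Chromium must be installed.

```python
import asyncio


async def cookies_are_isolated() -> bool:
    """Return True if a cookie set in one context is invisible to another."""
    # Imported here so this module can still be loaded where Playwright is absent.
    from playwright.async_api import async_playwright

    url = "https://example.com"
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        # Two contexts in one browser behave like two separate private windows.
        context_a = await browser.new_context()
        context_b = await browser.new_context()
        # Hypothetical session cookie, stored only in the first context.
        await context_a.add_cookies(
            [{"name": "session", "value": "abc123", "url": url}]
        )
        cookies_a = await context_a.cookies(url)
        cookies_b = await context_b.cookies(url)
        await browser.close()
        return len(cookies_a) == 1 and len(cookies_b) == 0
```

Calling asyncio.run(cookies_are_isolated()) runs the check. For an agent, this is what makes it practical to handle several logged-in sessions in parallel from a single browser process.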

If the script runs correctly and saves a screenshot, it means your environment is set up properly.

Web Navigation and Scraping

The reason to use Playwright rather than requests combined with BeautifulSoup is JavaScript rendering. Modern websites often deliver a minimal HTML skeleton and build the actual content dynamically after the page loads, using frameworks like React,...
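The problem can be illustrated without any network call. In the sketch below, SKELETON_HTML is a hypothetical response body of the kind a client-rendered application returns, and RENDERED_HTML is what the DOM might look like after its JavaScript has run. The standard library's html.parser stands in for BeautifulSoup here, which is an assumption made to keep the example dependency-free.

```python
from html.parser import HTMLParser

# Hypothetical raw response from a client-rendered site: no content yet.
SKELETON_HTML = """<!doctype html>
<html>
  <head><title>Shop</title></head>
  <body>
    <div id="root"></div>
    <script src="/static/app.js"></script>
  </body>
</html>"""

# Hypothetical DOM after the page's JavaScript has run in a real browser.
RENDERED_HTML = """<html><body>
  <div id="root"><h1>Blue Widget</h1><p>Price: 19.99</p></div>
</body></html>"""


class BodyTextExtractor(HTMLParser):
    """Collect visible text inside <body>, ignoring <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.in_body = False
        self.skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True
        elif tag in ("script", "style"):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "body":
            self.in_body = False
        elif tag in ("script", "style") and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is inside <body> and outside scripts.
        if self.in_body and not self.skip_depth and data.strip():
            self.chunks.append(data.strip())


def visible_body_text(html: str) -> str:
    """Return the visible body text of an HTML document as one string."""
    parser = BodyTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


if __name__ == "__main__":
    print(repr(visible_body_text(SKELETON_HTML)))  # what a static parser sees
    print(repr(visible_body_text(RENDERED_HTML)))  # what a browser would expose
```

A static parser applied to the skeleton finds no visible text at all, while the same parser applied to the rendered DOM finds the product name and price. A browser-driven tool closes that gap because it executes the JavaScript before the content is read.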
