
Agent Browser Automation
Give coding agents a fast headless browser CLI for navigation, forms, snapshots, and screenshots without pulling in a full Playwright Node stack.
Overview
agent-browser-automation is an agent skill most often used in Build (also Validate and Ship) that teaches a Rust CDP browser CLI for navigation, interaction, snapshots, and screenshots.
Install
npx skills add https://github.com/aradotso/trending-skills --skill agent-browser-automationWhat is this skill?
- Native Rust CLI over Chrome DevTools Protocol—no Node.js or Playwright runtime required
- Install modes: npm global (+ agent-browser install), Homebrew, Cargo, local npm/npx, and Linux --with-deps for system li
- Supports open/navigate, click/fill, accessibility snapshots, screenshots, network interception, and batched commands
- First-run downloads Chrome for Testing via agent-browser install
- Designed for AI agents from ara.so Daily 2026 Skills with trigger phrases for scraping and form automation
Adoption & trust: 1.7k installs on skills.sh; 31 GitHub stars; 1/3 security scanners passed (skills.sh audits).
What problem does it solve?
Your agent needs to drive a real browser for scraping or UI checks but you do not want a Node Playwright stack slowing every automation task.
Who is it for?
Indie builders shipping agent workflows that need headless Chrome control, product screenshots, or structured accessibility snapshots from the terminal.
Skip if: Teams that already standardize on Playwright test suites inside CI with no agent-driven CLI requirement.
When should I use this skill?
When automating browsers via CLI: scraping, screenshots, accessibility snapshots, form interaction, or batch browser commands with agent-browser.
What do I get? / Deliverables
You install agent-browser via your preferred channel, bootstrap Chrome for Testing once, and run CLI commands your agent can script for repeatable browser tasks.
- Installed agent-browser CLI with Chrome for Testing
- Scriptable navigation and interaction command sequences
- Screenshots and accessibility snapshots from live pages
Recommended Skills
Journey fit
Spans multiple journey phases - primary shelf plus alternate fits below.
Browser automation is primarily agent tooling you install while building scrapers, E2E helpers, and agent workflows, even though you may invoke it later in Ship or Grow. Agent-tooling is the canonical shelf because the skill documents the Rust CDP CLI install paths and command workflow agents call from terminals.
Where it fits
Wire agent-browser open and fill commands into a coding agent loop that files support tickets from a web portal.
Capture post-deploy screenshots of checkout flows to compare against baseline creatives.
How it compares
Headless Rust CDP CLI skill for agents—not a hosted browser SaaS or in-editor MCP unless you wire one yourself.
Common Questions / FAQ
Who is agent-browser-automation for?
Solo builders and agent authors who want a lightweight browser automation CLI callable from Claude Code, Cursor, Codex, or generic agent shells.
When should I use agent-browser-automation?
While building agent-tooling integrations, validating competitor pages during research, or capturing screenshots and DOM snapshots before launch QA.
Is agent-browser-automation safe to install?
Installing a global CLI grants local browser control; review the Security Audits panel on this Prism page and restrict network targets in untrusted environments.
SKILL.md
READMESKILL.md - Agent Browser Automation
# agent-browser > Skill by [ara.so](https://ara.so) — Daily 2026 Skills collection. `agent-browser` is a headless browser automation CLI built in Rust, designed for AI agents. It wraps Chrome via the Chrome DevTools Protocol (CDP) and exposes a fast, ergonomic command-line interface for navigation, interaction, accessibility snapshots, screenshots, network interception, and more — with no Node.js or Playwright runtime required. ## Installation ### Recommended (npm global) ```bash npm install -g agent-browser agent-browser install # Download Chrome for Testing (first time only) ``` ### macOS (Homebrew) ```bash brew install agent-browser agent-browser install ``` ### Rust / Cargo ```bash cargo install agent-browser agent-browser install ``` ### Local project dependency ```bash npm install agent-browser # Add to package.json scripts or invoke via npx ``` ### Linux (with system dependencies) ```bash agent-browser install --with-deps ``` ## Quick Start ```bash agent-browser open https://example.com agent-browser snapshot # Accessibility tree with @refs (best for AI) agent-browser click @e2 # Click by ref from snapshot agent-browser fill @e3 "hello@example.com" # Fill by ref agent-browser get text @e1 # Get text content agent-browser screenshot page.png agent-browser close ``` ## Core Commands ### Navigation ```bash agent-browser open <url> # Navigate (aliases: goto, navigate) agent-browser get url # Get current URL agent-browser get title # Get page title agent-browser close # Close browser (aliases: quit, exit) ``` ### Accessibility Snapshot (recommended for AI agents) ```bash agent-browser snapshot # Returns accessibility tree with @ref IDs agent-browser snapshot -i # Interactive / compact mode ``` Snapshot output includes `@eN` refs you can use directly: ``` @e1 [button] "Submit" @e2 [textbox] "Email" value="" @e3 [link] "Sign in" ``` Then act on them: ```bash agent-browser fill @e2 "user@example.com" agent-browser click @e1 ``` ### Interaction ```bash agent-browser click <sel> # Click element agent-browser dblclick <sel> # Double-click agent-browser fill <sel> <text> # Clear and fill input agent-browser type <sel> <text> # Type into element agent-browser press <key> # Press key (Enter, Tab, Control+a) agent-browser keyboard type <text> # Type at current focus (real keystrokes) agent-browser keyboard inserttext <text> # Insert text without key events agent-browser hover <sel> # Hover element agent-browser select <sel> <value> # Select dropdown option agent-browser check <sel> # Check checkbox agent-browser uncheck <sel> # Uncheck checkbox agent-browser scroll down 500 # Scroll (up/down/left/right, optional px) agent-browser scroll down --selector "#feed" # Scroll within element agent-browser scrollintoview <sel> # Scroll element into view agent-browser drag <src> <target> # Drag and drop agent-browser upload <sel> /path/file.pdf # Upload file ``` ### Screenshots & PDF ```bash agent-browser screenshot # Save to temp dir, print path agent-browser screenshot page.png # Save to path agent-browser screenshot --full page.png # Full-page scre