
Opencli Operate
Let your coding agent drive Chrome through real logged-in sessions for QA, ops tasks, and web workflows without handing over passwords or a separate LLM vision bill.
Overview
OpenCLI Operate is an agent skill most often used in Build (also Ship, Operate) that drives Chrome step-by-step via CLI using your existing login sessions—no separate LLM API key required.
Install
npx skills add https://github.com/jackwener/opencli --skill opencli-operateWhat is this skill?
- Step-by-step Chrome control via CLI: navigate, click, type, select, extract, and wait with structured `[N]` element indi
- Reuses existing Chrome login sessions via Browser Bridge—no password replay and no separate LLM API key for browsing
- Five critical rules: prefer `state` over `screenshot`, `click`/`type` over `eval`, verify with `get value`, refresh `sta
- Prerequisite health check via `opencli doctor` for extension + daemon connectivity
- allowed-tools pattern: Bash(opencli:*), Read, Edit, Write for agent orchestration
- 5 documented critical rules (state over screenshot, click/type over eval, get value verification, state after page chang
Adoption & trust: 1.9k installs on skills.sh; 23.8k GitHub stars; 0/3 security scanners passed (skills.sh audits).
What problem does it solve?
You need your agent to use real websites you are already logged into, but screenshots and ad-hoc JavaScript clicks are slow, expensive, and unreliable.
Who is it for?
Solo builders who already run Chrome with OpenCLI Bridge and want agents to operate authenticated web UIs for checks, data entry, or internal workflows.
Skip if: Headless-only CI pipelines with no desktop Chrome, teams that forbid browser automation on production accounts, or jobs that need full visual regression instead of structured DOM state.
When should I use this skill?
You need AI agents to navigate, click, type, extract, or wait on websites using Chrome with existing login sessions via OpenCLI.
What do I get? / Deliverables
Your agent gets a repeatable CLI playbook to inspect pages with `state`, interact by index, verify inputs, and complete multi-step web tasks inside authenticated Chrome.
- Completed multi-step browser workflow executed via OpenCLI CLI
- Structured page snapshots from `state` with element indices for follow-up commands
Recommended Skills
Journey fit
Spans multiple journey phases - primary shelf plus alternate fits below.
OpenCLI Operate is agent-facing browser infrastructure—its primary shelf is Build because solo builders install it to extend what Claude/Cursor can do on the web during product work. agent-tooling is the canonical home for skills that wire external runtimes (Chrome + OpenCLI daemon) into the agent loop, not for shipping a customer-facing feature.
Where it fits
Automate filling a third-party admin panel while integrating a webhook or OAuth app.
Walk through checkout or onboarding in staging with indexed clicks instead of manual QA.
Pull live store-console fields or screenshots only when explicitly requested, after structuring the page with `state`.
How it compares
Agent-side browser operator using your live Chrome session—not a hosted headless scraper API or a pure Playwright codegen skill.
Common Questions / FAQ
Who is opencli-operate for?
Indie and solo builders using Claude Code, Cursor, or similar agents who need dependable browser automation on sites where they are already signed in via Chrome.
When should I use opencli-operate?
During Build when wiring agent tooling for web tasks; during Ship when manually verifying flows in staging; during Operate when repeating dashboard or support actions; whenever SKILL.md rules say to use `state` and indexed clicks instead of screenshots.
Is opencli-operate safe to install?
It can execute real clicks and typing in your browser and shell—review the Security Audits panel on this page, confirm OpenCLI extension source, and avoid connecting it to high-privilege accounts until you trust the setup.
SKILL.md
READMESKILL.md - Opencli Operate
# OpenCLI Operate — Browser Automation for AI Agents Control Chrome step-by-step via CLI. Reuses existing login sessions — no passwords needed. ## Prerequisites ```bash opencli doctor # Verify extension + daemon connectivity ``` Requires: Chrome running + OpenCLI Browser Bridge extension installed. ## Critical Rules 1. **ALWAYS use `state` to inspect the page, NEVER use `screenshot`** — `state` returns structured DOM with `[N]` element indices, is instant and costs zero tokens. `screenshot` requires vision processing and is slow. Only use `screenshot` when the user explicitly asks to save a visual. 2. **ALWAYS use `click`/`type`/`select` for interaction, NEVER use `eval` to click or type** — `eval "el.click()"` bypasses scrollIntoView and CDP click pipeline, causing failures on off-screen elements. Use `state` to find the `[N]` index, then `click <N>`. 3. **Verify inputs with `get value`, not screenshots** — after `type`, run `get value <index>` to confirm. 4. **Run `state` after every page change** — after `open`, `click` (on links), `scroll`, always run `state` to see the new elements and their indices. Never guess indices. 5. **Chain commands aggressively with `&&`** — combine `open + state`, multiple `type` calls, and `type + get value` into single `&&` chains. Each tool call has overhead; chaining cuts it. 6. **`eval` is read-only** — use `eval` ONLY for data extraction (`JSON.stringify(...)`), never for clicking, typing, or navigating. Always wrap in IIFE to avoid variable conflicts: `eval "(function(){ const x = ...; return JSON.stringify(x); })()"`. 7. **Minimize total tool calls** — plan your sequence before acting. A good task completion uses 3-5 tool calls, not 15-20. Combine `open + state` as one call. Combine `type + type + click` as one call. Only run `state` separately when you need to discover new indices. 8. **Prefer `network` to discover APIs** — most sites have JSON APIs. API-based adapters are more reliable than DOM scraping. ## Command Cost Guide | Cost | Commands | When to use | |------|----------|-------------| | **Free & instant** | `state`, `get *`, `eval`, `network`, `scroll`, `keys` | Default — use these | | **Free but changes page** | `open`, `click`, `type`, `select`, `back` | Interaction — run `state` after | | **Expensive (vision tokens)** | `screenshot` | ONLY when user needs a saved image | ## Action Chaining Rules Commands can be chained with `&&`. The browser persists via daemon, so chaining is safe. **Always chain when possible** — fewer tool calls = faster completion: ```bash # GOOD: open + inspect in one call (saves 1 round trip) opencli operate open https://example.com && opencli operate state # GOOD: fill form in one call (saves 2 round trips) opencli operate type 3 "hello" && opencli operate type 4 "world" && opencli operate click 7 # GOOD: type + verify in one call opencli operate type 5 "test@example.com" && opencli operate get value 5 # GOOD: click + wait + state in one call (for page-changing clicks) opencli operate click 12 && opencli operate wait time 1 && opencli operate state # BAD: separate calls for each action (wasteful) opencli operate type 3 "hello" # Don't do this opencli operate type 4 "world" # when you can chain opencli operate click 7 # all three together ``` **Page-changing — always put last** in a chain (subsequent commands see stale indices): - `open <url>`, `back`, `click <link/button that navigates>` **Rule**: Chain when you already know the indices. Run `state` separately when you need to discover indices first. ## Core Workflow 1. **Navigate**: `opencli operate open <url>` 2. **Inspect**: `opencli operate state` → elements with `[N]` indices 3. **Interact**: use indices — `click`, `type`,