Opencli Operate

Name: Opencli Operate
Author: jackwener

jackwener/opencli

Let your coding agent drive Chrome through real logged-in sessions for QA, ops tasks, and web workflows without handing over passwords or a separate LLM vision bill.

Overview

OpenCLI Operate is an agent skill most often used in Build (also Ship, Operate) that drives Chrome step-by-step via CLI using your existing login sessions—no separate LLM API key required.

Install

npx skills add https://github.com/jackwener/opencli --skill opencli-operate

What is this skill?

Step-by-step Chrome control via CLI: navigate, click, type, select, extract, and wait with structured `[N]` element indi
Reuses existing Chrome login sessions via Browser Bridge—no password replay and no separate LLM API key for browsing
Five critical rules: prefer `state` over `screenshot`, `click`/`type` over `eval`, verify with `get value`, refresh `sta
Prerequisite health check via `opencli doctor` for extension + daemon connectivity
allowed-tools pattern: Bash(opencli:*), Read, Edit, Write for agent orchestration
5 documented critical rules (state over screenshot, click/type over eval, get value verification, state after page chang

Compatible agents: Claude Code, Cursor, Codex, any compatible agent

Adoption & trust: 1.9k installs on skills.sh; 23.8k GitHub stars; 0/3 security scanners passed (skills.sh audits).

What problem does it solve?

You need your agent to use real websites you are already logged into, but screenshots and ad-hoc JavaScript clicks are slow, expensive, and unreliable.

Who is it for?

Solo builders who already run Chrome with OpenCLI Bridge and want agents to operate authenticated web UIs for checks, data entry, or internal workflows.

Skip if: Headless-only CI pipelines with no desktop Chrome, teams that forbid browser automation on production accounts, or jobs that need full visual regression instead of structured DOM state.

When should I use this skill?

You need AI agents to navigate, click, type, extract, or wait on websites using Chrome with existing login sessions via OpenCLI.

What do I get? / Deliverables

Your agent gets a repeatable CLI playbook to inspect pages with `state`, interact by index, verify inputs, and complete multi-step web tasks inside authenticated Chrome.

Completed multi-step browser workflow executed via OpenCLI CLI
Structured page snapshots from `state` with element indices for follow-up commands

Recommended Skills

Agent Browservercel-labs/agent-browser

agent-browser is a Node-installed browser automation CLI built for AI agents that need dependable programmatic web inter…428k installs·35.5k stars

Lark Imlarksuite/cli

Lark IM is a Larksuite agent skill that exposes Feishu/Lark instant messaging to Claude Code, Cursor, and similar agents…210k installs·13.7k stars

Lark Calendarlarksuite/cli

lark-calendar is an agent skill for Feishu/Lark Calendar v4 exposed via lark-cli. Solo builders and small teams who alre…209k installs·13.7k stars

Lark Sheetslarksuite/cli

Skill for programmatic Feishu spreadsheet and worksheet management—create tables, bulk data IO, lookup, and export—using…209k installs·13.7k stars

Lark Vclarksuite/cli

lark-vc is an agent skill for Feishu/Lark video conferencing history and artifacts through lark-cli. After calls end, so…208k installs·13.7k stars

Lark Contactlarksuite/cli

CLI skill for Lark directory lookup: search employees and fetch metadata by open_id, with clear boundaries vs IM, calend…208k installs·13.7k stars

Journey fit

Spans multiple journey phases - primary shelf plus alternate fits below.

Primary fit

BuildAgent skills & templates

OpenCLI Operate is agent-facing browser infrastructure—its primary shelf is Build because solo builders install it to extend what Claude/Cursor can do on the web during product work. agent-tooling is the canonical home for skills that wire external runtimes (Chrome + OpenCLI daemon) into the agent loop, not for shipping a customer-facing feature.

Also useful

ShipTesting & QA

Also useful

OperateIteration & experiments

Where it fits

Example use

BuildIntegrations & version control

Automate filling a third-party admin panel while integrating a webhook or OAuth app.

Example use

ShipTesting & QA

Walk through checkout or onboarding in staging with indexed clicks instead of manual QA.

Example use

LaunchApp-store optimization

Pull live store-console fields or screenshots only when explicitly requested, after structuring the page with `state`.

How it compares

Agent-side browser operator using your live Chrome session—not a hosted headless scraper API or a pure Playwright codegen skill.

Common Questions / FAQ

Who is opencli-operate for?

Indie and solo builders using Claude Code, Cursor, or similar agents who need dependable browser automation on sites where they are already signed in via Chrome.

When should I use opencli-operate?

During Build when wiring agent tooling for web tasks; during Ship when manually verifying flows in staging; during Operate when repeating dashboard or support actions; whenever SKILL.md rules say to use `state` and indexed clicks instead of screenshots.

Is opencli-operate safe to install?

It can execute real clicks and typing in your browser and shell—review the Security Audits panel on this page, confirm OpenCLI extension source, and avoid connecting it to high-privilege accounts until you trust the setup.

SKILL.md

READMESKILL.md - Opencli Operate

# OpenCLI Operate — Browser Automation for AI Agents

Control Chrome step-by-step via CLI. Reuses existing login sessions — no passwords needed.

## Prerequisites

```bash
opencli doctor    # Verify extension + daemon connectivity
```

Requires: Chrome running + OpenCLI Browser Bridge extension installed.

## Critical Rules

1. **ALWAYS use `state` to inspect the page, NEVER use `screenshot`** — `state` returns structured DOM with `[N]` element indices, is instant and costs zero tokens. `screenshot` requires vision processing and is slow. Only use `screenshot` when the user explicitly asks to save a visual.
2. **ALWAYS use `click`/`type`/`select` for interaction, NEVER use `eval` to click or type** — `eval "el.click()"` bypasses scrollIntoView and CDP click pipeline, causing failures on off-screen elements. Use `state` to find the `[N]` index, then `click <N>`.
3. **Verify inputs with `get value`, not screenshots** — after `type`, run `get value <index>` to confirm.
4. **Run `state` after every page change** — after `open`, `click` (on links), `scroll`, always run `state` to see the new elements and their indices. Never guess indices.
5. **Chain commands aggressively with `&&`** — combine `open + state`, multiple `type` calls, and `type + get value` into single `&&` chains. Each tool call has overhead; chaining cuts it.
6. **`eval` is read-only** — use `eval` ONLY for data extraction (`JSON.stringify(...)`), never for clicking, typing, or navigating. Always wrap in IIFE to avoid variable conflicts: `eval "(function(){ const x = ...; return JSON.stringify(x); })()"`.
7. **Minimize total tool calls** — plan your sequence before acting. A good task completion uses 3-5 tool calls, not 15-20. Combine `open + state` as one call. Combine `type + type + click` as one call. Only run `state` separately when you need to discover new indices.
8. **Prefer `network` to discover APIs** — most sites have JSON APIs. API-based adapters are more reliable than DOM scraping.

## Command Cost Guide

| Cost | Commands | When to use |
|------|----------|-------------|
| **Free & instant** | `state`, `get *`, `eval`, `network`, `scroll`, `keys` | Default — use these |
| **Free but changes page** | `open`, `click`, `type`, `select`, `back` | Interaction — run `state` after |
| **Expensive (vision tokens)** | `screenshot` | ONLY when user needs a saved image |

## Action Chaining Rules

Commands can be chained with `&&`. The browser persists via daemon, so chaining is safe.

**Always chain when possible** — fewer tool calls = faster completion:
```bash
# GOOD: open + inspect in one call (saves 1 round trip)
opencli operate open https://example.com && opencli operate state

# GOOD: fill form in one call (saves 2 round trips)
opencli operate type 3 "hello" && opencli operate type 4 "world" && opencli operate click 7

# GOOD: type + verify in one call
opencli operate type 5 "test@example.com" && opencli operate get value 5

# GOOD: click + wait + state in one call (for page-changing clicks)
opencli operate click 12 && opencli operate wait time 1 && opencli operate state

# BAD: separate calls for each action (wasteful)
opencli operate type 3 "hello"    # Don't do this
opencli operate type 4 "world"    # when you can chain
opencli operate click 7            # all three together
```

**Page-changing — always put last** in a chain (subsequent commands see stale indices):
- `open <url>`, `back`, `click <link/button that navigates>`

**Rule**: Chain when you already know the indices. Run `state` separately when you need to discover indices first.

## Core Workflow

1. **Navigate**: `opencli operate open <url>`
2. **Inspect**: `opencli operate state` → elements with `[N]` indices
3. **Interact**: use indices — `click`, `type`,

What is this skill?

Step-by-step Chrome control via CLI: navigate, click, type, select, extract, and wait with structured `[N]` element indi

Reuses existing Chrome login sessions via Browser Bridge—no password replay and no separate LLM API key for browsing

Five critical rules: prefer `state` over `screenshot`, `click`/`type` over `eval`, verify with `get value`, refresh `sta

Prerequisite health check via `opencli doctor` for extension + daemon connectivity

allowed-tools pattern: Bash(opencli:*), Read, Edit, Write for agent orchestration

5 documented critical rules (state over screenshot, click/type over eval, get value verification, state after page chang

Compatible agents: Claude Code, Cursor, Codex, any compatible agent

Adoption & trust: 1.9k installs on skills.sh; 23.8k GitHub stars; 0/3 security scanners passed (skills.sh audits).

Who is it for?

Solo builders who already run Chrome with OpenCLI Bridge and want agents to operate authenticated web UIs for checks, data entry, or internal workflows.

Skip if: Headless-only CI pipelines with no desktop Chrome, teams that forbid browser automation on production accounts, or jobs that need full visual regression instead of structured DOM state.

What do I get? / Deliverables

Your agent gets a repeatable CLI playbook to inspect pages with `state`, interact by index, verify inputs, and complete multi-step web tasks inside authenticated Chrome.

Completed multi-step browser workflow executed via OpenCLI CLI

Structured page snapshots from `state` with element indices for follow-up commands

Journey fit

Spans multiple journey phases - primary shelf plus alternate fits below.

Primary fit

BuildAgent skills & templates

Also useful

ShipTesting & QA

Also useful

OperateIteration & experiments

Where it fits

Example use

BuildIntegrations & version control

Automate filling a third-party admin panel while integrating a webhook or OAuth app.

Example use

ShipTesting & QA

Walk through checkout or onboarding in staging with indexed clicks instead of manual QA.

Example use

LaunchApp-store optimization

Pull live store-console fields or screenshots only when explicitly requested, after structuring the page with `state`.

SKILL.md

READMESKILL.md - Opencli Operate

# OpenCLI Operate — Browser Automation for AI Agents

Control Chrome step-by-step via CLI. Reuses existing login sessions — no passwords needed.

## Prerequisites

```bash
opencli doctor    # Verify extension + daemon connectivity
```

Requires: Chrome running + OpenCLI Browser Bridge extension installed.

## Critical Rules

1. **ALWAYS use `state` to inspect the page, NEVER use `screenshot`** — `state` returns structured DOM with `[N]` element indices, is instant and costs zero tokens. `screenshot` requires vision processing and is slow. Only use `screenshot` when the user explicitly asks to save a visual.
2. **ALWAYS use `click`/`type`/`select` for interaction, NEVER use `eval` to click or type** — `eval "el.click()"` bypasses scrollIntoView and CDP click pipeline, causing failures on off-screen elements. Use `state` to find the `[N]` index, then `click <N>`.
3. **Verify inputs with `get value`, not screenshots** — after `type`, run `get value <index>` to confirm.
4. **Run `state` after every page change** — after `open`, `click` (on links), `scroll`, always run `state` to see the new elements and their indices. Never guess indices.
5. **Chain commands aggressively with `&&`** — combine `open + state`, multiple `type` calls, and `type + get value` into single `&&` chains. Each tool call has overhead; chaining cuts it.
6. **`eval` is read-only** — use `eval` ONLY for data extraction (`JSON.stringify(...)`), never for clicking, typing, or navigating. Always wrap in IIFE to avoid variable conflicts: `eval "(function(){ const x = ...; return JSON.stringify(x); })()"`.
7. **Minimize total tool calls** — plan your sequence before acting. A good task completion uses 3-5 tool calls, not 15-20. Combine `open + state` as one call. Combine `type + type + click` as one call. Only run `state` separately when you need to discover new indices.
8. **Prefer `network` to discover APIs** — most sites have JSON APIs. API-based adapters are more reliable than DOM scraping.

## Command Cost Guide

| Cost | Commands | When to use |
|------|----------|-------------|
| **Free & instant** | `state`, `get *`, `eval`, `network`, `scroll`, `keys` | Default — use these |
| **Free but changes page** | `open`, `click`, `type`, `select`, `back` | Interaction — run `state` after |
| **Expensive (vision tokens)** | `screenshot` | ONLY when user needs a saved image |

## Action Chaining Rules

Commands can be chained with `&&`. The browser persists via daemon, so chaining is safe.

**Always chain when possible** — fewer tool calls = faster completion:
```bash
# GOOD: open + inspect in one call (saves 1 round trip)
opencli operate open https://example.com && opencli operate state

# GOOD: fill form in one call (saves 2 round trips)
opencli operate type 3 "hello" && opencli operate type 4 "world" && opencli operate click 7

# GOOD: type + verify in one call
opencli operate type 5 "test@example.com" && opencli operate get value 5

# GOOD: click + wait + state in one call (for page-changing clicks)
opencli operate click 12 && opencli operate wait time 1 && opencli operate state

# BAD: separate calls for each action (wasteful)
opencli operate type 3 "hello"    # Don't do this
opencli operate type 4 "world"    # when you can chain
opencli operate click 7            # all three together
```

**Page-changing — always put last** in a chain (subsequent commands see stale indices):
- `open <url>`, `back`, `click <link/button that navigates>`

**Rule**: Chain when you already know the indices. Run `state` separately when you need to discover indices first.

## Core Workflow

1. **Navigate**: `opencli operate open <url>`
2. **Inspect**: `opencli operate state` → elements with `[N]` indices
3. **Interact**: use indices — `click`, `type`,

Overview

Install

What is this skill?

What problem does it solve?

Who is it for?

When should I use this skill?

What do I get? / Deliverables

Recommended Skills

Journey fit

Where it fits

Who is opencli-operate for?

When should I use opencli-operate?

Is opencli-operate safe to install?

SKILL.md

This week for builders

Overview

Install

What is this skill?

What problem does it solve?

Who is it for?

When should I use this skill?

What do I get? / Deliverables

Recommended Skills

Journey fit

Where it fits

Who is opencli-operate for?

When should I use opencli-operate?

Is opencli-operate safe to install?

SKILL.md