Browser Automation

Browser control is wired into the product and agent toolchain during implementation, for work that needs repeatable web flows and authenticated sessions. The Playwright handlers and CDP/Chromium hooks are third-party automation integrations that the agent invokes from the skill package in the repo.

Where it fits

Example use

Walk through a competitor signup funnel with handleNavigate and handleFill before you commit to your own UX.

Example use

Script admin-console clicks and uploads after deploying your SaaS backend.

Example use

Capture handleScreenshot evidence of critical paths on staging ahead of launch.
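
The staging-evidence use above comes down to sequencing the skill's `handle*` calls. Here is a minimal sketch, assuming the handlers are passed in as an object `h` (in practice presumably the result of `require('./index.js')` in the skill directory). The name `captureEvidence` and the shape of the returned list are illustrative assumptions, not part of the skill.

```javascript
// Hypothetical helper: visits each URL and takes a full-page screenshot.
// `h` is any object exposing handleNavigate and handleScreenshot as async functions.
async function captureEvidence(h, urls) {
  const evidence = [];
  for (const url of urls) {
    await h.handleNavigate({ url });
    // What handleScreenshot returns is not documented here, so it is stored untouched.
    const shot = await h.handleScreenshot({ fullPage: true });
    evidence.push({ url, shot });
  }
  return evidence;
}
```

Passing the handlers in, rather than requiring them inside the function, lets you rehearse the same loop against a stub before pointing it at a real staging site.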

How it compares

Use this Playwright skill package when you need persistent sessions and rich DOM operations; ad-hoc built-in browser snippets run without a cookie profile.

Common Questions / FAQ

Who is browser-automation for?

Indie and solo developers using OpenClaw-style agent stacks who want Playwright-grade control when users ask for webpage opens, clicks, forms, or screenshots in chat.

When should I use browser-automation?

During Validate when you prototype a checkout or onboarding flow in a real browser; during Build when you integrate agent-driven ops against admin UIs; during Ship when you capture screenshots or regression evidence before release.

Is browser-automation safe to install?

It runs local Node against real browsers and can execute page JavaScript, so review the Security Audits panel on this Prism page and inspect `index.js` before pointing it at production accounts.

SKILL.md

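# Browser Automation Skill

Let OpenClaw control a browser to carry out automated operations!

## Highlights

- **Page reuse**: reuses the existing page by default instead of opening a new window every time
- **Persistent cookies**: uses a dedicated user data directory, so login state is preserved
- **Full set of operations**: supports clicking, form filling, screenshots, running JS, and more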
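
Page reuse and persistent cookies map naturally onto Playwright's persistent context. The sketch below shows one way to get that behavior; it is an assumption about the approach, since the skill's `index.js` is not shown here. The function name `getReusablePage` and the `headless: false` option are illustrative.

```javascript
// `chromium` is expected to be Playwright's browser type, e.g. require('playwright').chromium.
// It is passed in so the sketch can be exercised without launching a real browser.
async function getReusablePage(chromium, userDataDir) {
  // launchPersistentContext keeps cookies and local storage under userDataDir,
  // so a login survives between runs.
  const context = await chromium.launchPersistentContext(userDataDir, { headless: false });
  // A persistent context usually starts with one page; reuse it rather than opening another.
  const [existing] = context.pages();
  const page = existing ?? (await context.newPage());
  return { context, page };
}
```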

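## Capabilities

- Navigate to a URL
- Click elements
- Fill in forms
- Take screenshots
- Get page content
- Wait for elements
- Execute JavaScript

## When to use

When the user says:
- "Open xxx.com" (打开 xxx.com)
- "Click the login button" (点击登录按钮)
- "Fill in this form for me" (帮我填写这个表单)
- "Take a screenshot" (截个图)
- "What's on the page?" (网页上有什么内容)

## Available functions

Invocation: `cd C:/Users/admin/.openclaw/skills/browser-automation && node -e "const h=require('./index.js'); h.handleXXX({...}).then(console.log)"`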
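
The one-liner above needs careful quoting once the parameters contain quotes or spaces. As a hedged sketch, the same call can be made from Node with `execFile`, which passes the script as a single argument and avoids the shell altogether. `buildNodeArgs` and `invokeHandler` are hypothetical names, and the sketch assumes every handler takes one JSON-serializable object.

```javascript
const { execFile } = require('node:child_process');

// Hypothetical helper: builds the argv for `node -e` so no shell quoting is needed.
function buildNodeArgs(handler, params = {}) {
  // Only accept names shaped like the documented handle* functions.
  if (!/^handle[A-Z]\w*$/.test(handler)) {
    throw new Error(`Unexpected handler name: ${handler}`);
  }
  const script =
    `const h=require('./index.js'); ` +
    `h.${handler}(${JSON.stringify(params)}).then(console.log)`;
  return ['-e', script];
}

// Hypothetical wrapper: runs the handler inside the skill directory and resolves with stdout.
function invokeHandler(skillDir, handler, params) {
  return new Promise((resolve, reject) => {
    execFile('node', buildNodeArgs(handler, params), { cwd: skillDir }, (err, stdout) =>
      err ? reject(err) : resolve(stdout)
    );
  });
}
```

For example, `invokeHandler('C:/Users/admin/.openclaw/skills/browser-automation', 'handleNavigate', { url: 'https://github.com' })` would run the same script as the one-liner, with the working directory set by `cwd` instead of `cd`.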

### Page operations
- `handleNavigate({url})` - Navigate to a URL (reuses the existing page)
- `handleNewPage({url})` - Open a new page
- `handleScreenshot({selector?, fullPage?})` - Take a screenshot
- `handleGetContent({selector?})` - Get the page content
- `handleClose()` - Close the current page

### Interaction
- `handleClick({selector})` - Click an element
- `handleFill({selector, value, clear?})` - Fill in a form field
- `handleType({selector, text, delay?})` - Simulate typing
- `handleSelect({selector, value})` - Choose a dropdown option
- `handleCheck({selector, checked?})` - Check or uncheck a box

### Waiting and reading
- `handleWait({selector, timeout?})` - Wait for an element to appear
- `handleWaitForNavigation({timeout?})` - Wait for a page navigation
- `handleGetText({selector})` - Get an element's text
- `handleGetValue({selector})` - Get a form field's value
- `handleGetAttribute({selector, attribute})` - Get an attribute
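
The waiting and reading functions are typically paired: wait for an element, then read it. The sketch below assumes the handlers are passed in as `h` and that `timeout` is in milliseconds (Playwright's convention; the unit is not stated in this document). `waitAndGetText` is a hypothetical name.

```javascript
// Hypothetical helper: waits for `selector` to appear, then returns its text.
// If handleWait rejects (for example on timeout), the rejection propagates to the caller.
async function waitAndGetText(h, selector, timeout = 5000) {
  await h.handleWait({ selector, timeout });
  return h.handleGetText({ selector });
}
```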

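### Advanced operations
- `handleEvaluate({script})` - Execute JavaScript
- `handleUpload({selector, filePath})` - Upload a file
- `handlePress({key})` - Press a key
- `handleHover({selector})` - Hover the mouse over an element
- `handleScroll({direction, amount?})` - Scroll the page

### Status
- `handleStatus()` - Get the current browser status
- `handleCloseBrowser()` - Close the browser (it is restarted on the next call)

## Selector syntax

CSS selectors and text selectors are supported:
- CSS: `#login-btn`, `.submit-button`, `input[name="email"]`
- Text: `text=登录`, `text=提交` (these match the visible text 登录, "log in", and 提交, "submit")
- Combined: `button:has-text("提交")`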
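
The text and combined selector forms above are plain strings, so they can be assembled with small helpers. `byText` and `hasText` are hypothetical names used only for illustration. The sketch rejects text containing double quotes instead of guessing at escaping rules.

```javascript
// Hypothetical helper: text selector, e.g. byText('登录') gives 'text=登录'.
function byText(text) {
  return `text=${text}`;
}

// Hypothetical helper: CSS selector narrowed by visible text.
function hasText(css, text) {
  // Keep the sketch simple: refuse text that would need escaping inside the quotes.
  if (text.includes('"')) {
    throw new Error('text must not contain double quotes');
  }
  return `${css}:has-text("${text}")`;
}
```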

## Examples

```
User: Open github.com
Agent: [calls handleNavigate({url: 'https://github.com'})] Opened GitHub...

User: Click sign in
Agent: [calls handleClick({selector: 'text=Sign in'})] Clicked sign in...

User: Fill in the email test@example.com
Agent: [calls handleFill({selector: '#login_field', value: 'test@example.com'})] Filled in the email...

User: Take a screenshot so I can see
Agent: [calls handleScreenshot()] [returns screenshot]
```
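
The conversation above can also be written as data and replayed. This is a minimal sketch, assuming the handlers are supplied as an object `h` whose functions return promises; `githubSteps` and `runSteps` are hypothetical names.

```javascript
// The conversation written as data: one [handlerName, params] pair per turn.
const githubSteps = [
  ['handleNavigate', { url: 'https://github.com' }],
  ['handleClick', { selector: 'text=Sign in' }],
  ['handleFill', { selector: '#login_field', value: 'test@example.com' }],
  ['handleScreenshot', undefined],
];

// Hypothetical runner: calls each step in order on `h` and collects the results.
// Rejects if a step names a function that `h` does not provide.
async function runSteps(h, steps) {
  const results = [];
  for (const [name, params] of steps) {
    if (typeof h[name] !== 'function') {
      throw new Error(`Unknown handler: ${name}`);
    }
    results.push(await h[name](params));
  }
  return results;
}
```

Keeping the steps as data makes a flow easy to log, diff, or rerun after a page changes.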


node_modules/
*.log
.DS_Store
Thumbs.db


{
  "name": "browser-automation",
  "version": "1.0.0",
  "lockfileVersion": 3,
  "requires": true,
  "packages": {
    "": {
      "name": "browser-automation",
      "version": "1.0.0",
      "dependencies": {
        "playwright": "^1.48.0"
      }
    },
    "node_modules/fsevents": {
      "version": "2.3.2",
      "resolved": "https://registry.npmjs.org/fsevents/-/fsevents-2.3.2.tgz",
      "integrity": "sha512-xiqMQR4xAeHTuB9uWm+fFRcIOgKBMiOBP+eXiyT7jsgVCq1bkVygt00oASowB7EdtpOHaaPgKt812P9ab+DDKA==",
      "hasInstallScript": true,
      "license": "MIT",
      "optional": true,
      "os": [
        "darwin"
      ],
      "engines": {
        "node": "^8.16.0 || ^10.6.0 || >=11.0.0"
      }
    },
    "node_modules/playwright": {
      "version": "1.58.2",
      "resolved": "https://registry.npmjs.org/playwright/-/playwright-1.58.2.tgz",
      "integrity": "sha512-vA30H8Nvkq/cPBnNw4Q8TWz1EJyqgpuinBcHET0YVJVFldr8JDNiU9LaWAE1KqSkRYazuaBhTpB5ZzShOezQ6A==",
      "license": "Apache-2.0",
      "dependencies": {
        "playwright-core": "1.58.2"
      },
      "bin": {
        "playwright": "cli.js"
      },
      "engines": {
        "node": ">=18"
      },
      "optionalDependencies": {
        "fsevents": "2.3.2"
      }
    },
    "node_modules/playwright-core": {
      "version": "1.58.2",
      "resolved": "https://registry.npmjs.org/playwright-core/-/playwright-core-1.58.2.tgz",
      "integrity": "sha512-yZkEtftgwS8CsfYo7nm0KE8jsvm6i/PTgVtB8DL726wNf6H2IMsDuxCpJj59KDaxCtSnrWan2AeDqM7JBaultg==",
      "license": "Apache-2.0",
      "bin": {
        "playwright-core": "cli.js"
      },
      "engines": {
        "node": ">=18"
      }
    }
  }
}
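
The lockfile above resolves `playwright` to 1.58.2, whose `engines` field requires Node `>=18`. A small preflight check along these lines can surface a mismatch before any browser is launched. `meetsNodeRequirement` is a hypothetical helper, not something the package ships.

```javascript
// Returns true when a version string such as "18.19.0" or "v20.11.1"
// has a major version of at least minMajor.
function meetsNodeRequirement(version, minMajor = 18) {
  const major = Number.parseInt(version.replace(/^v/, '').split('.')[0], 10);
  return Number.isInteger(major) && major >= minMajor;
}

if (!meetsNodeRequirement(process.versions.node)) {
  console.error(`Node ${process.versions.node} is older than the >=18 that playwright 1.58.2 requires`);
}
```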


# Browser Automation Skill Tools

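## Overview

This skill uses Playwright to control a browser for automated operations.

**Highlights**:
- Reuses the existing page by default and does not open a new window
- Cookies and login state are persisted
- Supports all common browser operations

## Usage

Invocation: `cd C:/Users/admin/.opencla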

What is this skill?

Playwright-based navigation, click, fill, type, select, check, upload, hover, scroll, and key press via named handle* functions

Reuses an existing tab by default and keeps cookies in a dedicated user-data directory for persistent logins

CDP attach to an open Chrome or launch a fresh Chromium instance

Screenshot (full page or selector), wait for elements/navigation, get text/value/attributes, and evaluate arbitrary JavaScript

Documented trigger: prefer this skill over the agent’s built-in browser tool when users ask for 打开网页 (open a webpage), 点击 (click), 填表 (fill in a form), or 截图 (take a screenshot)

Documented handler groups cover page ops, interaction, wait/get, and advanced actions (evaluate, upload, hover, scroll).

Default behavior reuses an existing page instead of opening a new window each invocation.

Compatible agents: Claude Code, Cursor, Codex, any compatible agent

Adoption & trust: 2.8k installs on skills.sh; 1/3 security scanners passed (skills.sh audits).

What do I get? / Deliverables

Your agent reuses a controlled Chromium or CDP Chrome page, completes the requested web actions, and returns content or screenshots from the skill’s handle API.

Executed browser actions with returned page content, element text/values, or image screenshots

Persistent session state via the skill’s user-data directory when re-run
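
The "Chromium or CDP Chrome page" wording above suggests two ways of obtaining a browser. One plausible arrangement is to try attaching to an already running Chrome over CDP and fall back to launching Chromium with a persistent profile. This sketch is an assumption about the approach, not the skill's actual code. `attachOrLaunch`, the endpoint `http://localhost:9222`, and the `./user-data` path are illustrative.

```javascript
// `chromium` is expected to be require('playwright').chromium; it is injected so the
// sketch can run against a stub. The endpoint URL and profile path are assumptions.
async function attachOrLaunch(chromium, { cdpUrl = 'http://localhost:9222', userDataDir = './user-data' } = {}) {
  let browser = null;
  try {
    // Attach to a Chrome that was started with --remote-debugging-port=9222.
    browser = await chromium.connectOverCDP(cdpUrl);
  } catch {
    // Nothing is listening on the CDP endpoint; fall through to a fresh launch.
  }
  if (browser) {
    const context = browser.contexts()[0] ?? (await browser.newContext());
    return { mode: 'cdp', context };
  }
  // Fresh Chromium whose cookies persist under userDataDir.
  const context = await chromium.launchPersistentContext(userDataDir, { headless: false });
  return { mode: 'launch', context };
}
```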

Journey fit

Spans multiple journey phases - primary shelf plus alternate fits below.

Primary fit
