Open web extraction is most often installed first in the Idea phase, when you are validating markets and sources before a full data pipeline exists. The Research subphase covers ad-hoc competitive and market intelligence gathering that does not yet belong in production ETL.

SKILL.md - General Web Scraper

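# Web Scraper: General-Purpose Web Data Scraping Tool

A web data scraping tool built for AI agents. Given a URL and a CSS selector, it scrapes links or table data and exports the result as CSV or JSON.

## Features

- **Link scraping**: collects every link on the page that matches the CSS selector
- **Table scraping**: extracts HTML table data automatically
- **CSV export**: writes CSV by default
- **JSON export**: optional JSON output
- **Chinese-friendly**: full support for Chinese web page encodings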
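The source of `scraper.py` is not shown here, so the following is only a minimal sketch of how table extraction with CSV and JSON export could work. It assumes the standard-library `html.parser` in place of BeautifulSoup so that it runs without any installation, and the names `TableParser`, `extract_table`, `to_csv`, `to_json`, and `SAMPLE` are hypothetical.

```python
import csv
import io
import json
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the cell text of every <tr> in the document as a list of rows."""

    def __init__(self):
        super().__init__()
        self.rows = []      # completed rows
        self._row = None    # row being built, or None when outside a <tr>
        self._cell = None   # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a cell (for example whitespace between tags) is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def extract_table(html):
    """Return all table rows found in an HTML string."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    return parser.rows


def to_csv(rows):
    """Serialize rows as CSV text (the documented default format)."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


def to_json(rows):
    """Serialize rows as a JSON array of objects keyed by the header row."""
    header, *body = rows
    # ensure_ascii=False keeps Chinese text readable in the output.
    return json.dumps([dict(zip(header, r)) for r in body], ensure_ascii=False)


# Hypothetical sample page fragment, used instead of a live HTTP request.
SAMPLE = """
<table id="data">
  <tr><th>产品</th><th>价格</th></tr>
  <tr><td>Widget</td><td>9.99</td></tr>
  <tr><td>小工具</td><td>19.50</td></tr>
</table>
"""

if __name__ == "__main__":
    rows = extract_table(SAMPLE)
    print(to_csv(rows), end="")
    print(to_json(rows))
```

This sketch treats the first row as the header when building JSON objects and does not handle nested tables or `colspan`; the real tool may behave differently.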

## Usage

```bash
# Scrape every link on the page
python scraper.py https://example.com

# Use a custom CSS selector
python scraper.py https://example.com "a.article-link"

# Export as JSON
python scraper.py https://example.com "div.item" --json

# Scrape table data
python scraper.py https://example.com "table#data" --table --json
```
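For readers who want to see how a command line of this shape might be parsed, here is a hedged sketch using the standard-library `argparse`. It is not the actual `scraper.py` code. The default selector `"a"` is an assumption based on the listing's note that the default is page anchors, and `build_parser` and `describe` are hypothetical helper names.

```python
import argparse


def build_parser():
    """Build a parser for: scraper.py URL [selector] [--json] [--table]."""
    parser = argparse.ArgumentParser(prog="scraper.py")
    parser.add_argument("url", help="page to fetch")
    # Optional positional; "a" (all anchors) is an assumed default.
    parser.add_argument("selector", nargs="?", default="a",
                        help="CSS selector for the elements to extract")
    parser.add_argument("--json", action="store_true",
                        help="write JSON instead of the default CSV")
    parser.add_argument("--table", action="store_true",
                        help="extract table rows instead of links")
    return parser


def describe(args):
    """Summarize what a parsed invocation would do."""
    mode = "table" if args.table else "links"
    fmt = "json" if args.json else "csv"
    return f"{mode} from {args.url} matching {args.selector!r} as {fmt}"


if __name__ == "__main__":
    # Explicit argument list for demonstration, mirroring the last usage example.
    demo = ["https://example.com", "table#data", "--table", "--json"]
    print(describe(build_parser().parse_args(demo)))
```

Making the selector an optional positional with `nargs="?"` keeps the shortest invocation (URL only) valid, which matches the first usage example.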

## Installing dependencies

```bash
pip install requests beautifulsoup4
```

## Use cases

- Data collection and research
- Competitor monitoring
- Market intelligence gathering
- Content aggregation

## Tags
scraping, web, data, python, automation, crawler, data-collection

General Web Scraper

skills.volces.com

Pull links or table data from any URL with CSS selectors and export CSV or JSON for market and competitor research.

Install

npx skills add https://github.com/skills.volces.com --skill general-web-scraper

What is this skill?

- Link scraping via configurable CSS selectors (default: page anchors)
- HTML table extraction with `--table` mode
- CSV default export plus optional JSON output
- Chinese encoding support for international pages
- CLI: `python scraper.py URL [selector] [--json] [--table]`
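The listing does not say how Chinese encoding support is implemented. One common approach, sketched below purely as an assumption, is to try any charset declared in the HTTP header or in a `<meta>` tag, then fall back to UTF-8 and GB18030 (a superset of GBK and GB2312). The names `decode_html` and `META_CHARSET` are hypothetical.

```python
import re

# Matches charset declarations such as <meta charset="gbk"> or
# <meta http-equiv="Content-Type" content="text/html; charset=gb2312">.
META_CHARSET = re.compile(rb"""<meta[^>]+charset=["']?\s*([\w-]+)""", re.IGNORECASE)


def decode_html(raw, header_charset=None):
    """Decode page bytes, trying declared charsets before common fallbacks."""
    candidates = []
    if header_charset:
        candidates.append(header_charset)
    # Only scan the start of the document, where <meta> tags normally live.
    match = META_CHARSET.search(raw[:4096])
    if match:
        candidates.append(match.group(1).decode("ascii"))
    candidates += ["utf-8", "gb18030"]
    for name in candidates:
        try:
            return raw.decode(name)
        except (LookupError, UnicodeDecodeError):
            # Unknown codec name or bytes that do not fit it: try the next one.
            continue
    # Last resort: keep going with replacement characters rather than failing.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    page = '<html><head><meta charset="gbk"></head><body>市场情报</body></html>'
    print(decode_html(page.encode("gbk")))
```

With `requests`, the raw bytes would come from `response.content`; passing them through a function like this avoids relying on a guessed `response.encoding` when the server omits the charset.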

Adoption & trust: 1 install on skills.sh; trending (+100% hot-view momentum).

Journey fit

Primary fit

Idea: Opportunity & market research
