
General Web Scraper
Pull links or table data from any URL with CSS selectors and export CSV or JSON for market and competitor research.
Install
npx skills add https://github.com/skills.volces.com --skill general-web-scraper
What is this skill?
- Link scraping via configurable CSS selectors (default: page anchors)
- HTML table extraction with --table mode
- CSV default export plus optional JSON output
- Chinese encoding support for international pages
- CLI: python scraper.py URL [selector] [--json] [--table]
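The command-line shape listed above can be sketched with `argparse`. This is a minimal illustration, not the skill's actual parser: the function names `build_parser` and `parse_cli` are hypothetical, and the default selector `"a"` is an assumption based on the "default: page anchors" note.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring: scraper.py URL [selector] [--json] [--table]."""
    parser = argparse.ArgumentParser(
        prog="scraper.py",
        description="Scrape links or table data from a URL.",
    )
    parser.add_argument("url", help="page to scrape")
    # Assumption: with no selector given, every anchor element is matched.
    parser.add_argument("selector", nargs="?", default="a",
                        help="CSS selector (assumed default: all anchors)")
    parser.add_argument("--json", action="store_true",
                        help="export JSON instead of the default CSV")
    parser.add_argument("--table", action="store_true",
                        help="extract HTML table rows instead of links")
    return parser


def parse_cli(argv: list[str]) -> dict:
    """Turn raw arguments into a plain options dictionary."""
    args = build_parser().parse_args(argv)
    return {
        "url": args.url,
        "selector": args.selector,
        "mode": "table" if args.table else "links",
        "format": "json" if args.json else "csv",
    }


if __name__ == "__main__":
    print(parse_cli(["https://example.com", "table#data", "--table", "--json"]))
```

Making the selector an optional positional argument keeps the shortest invocation (`python scraper.py URL`) valid while still allowing both flags in any order after it.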
Adoption & trust: 1 install on skills.sh; trending (+100% hot-view momentum).
Recommended Skills
- Agent Browser (vercel-labs/agent-browser)
- Lark Im (larksuite/cli)
- Lark Calendar (larksuite/cli)
- Lark Sheets (larksuite/cli)
- Lark Vc (larksuite/cli)
- Lark Contact (larksuite/cli)
Journey fit
Primary fit
Open web extraction is most often installed first in the Idea phase, when markets and sources are being validated before a full data pipeline exists. The Research subphase covers ad-hoc competitive and market intelligence gathering that does not yet belong in production ETL.
SKILL.md
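# Web Scraper: General-Purpose Web Data Scraping Tool

A web data scraping tool built for AI agents. Give it a URL and a CSS selector, and it scrapes links or table data and exports the result as CSV or JSON.

## Features

- **Link scraping**: scrapes every link on the page that matches the CSS selector
- **Table scraping**: automatically extracts HTML table data
- **CSV export**: CSV is the default output format
- **JSON export**: JSON output is also supported
- **Chinese-friendly**: full support for Chinese web page encodings

## Usage

```bash
# Scrape all links on the page
python scraper.py https://example.com

# Use a custom CSS selector
python scraper.py https://example.com "a.article-link"

# Export as JSON
python scraper.py https://example.com "div.item" --json

# Scrape table data
python scraper.py https://example.com "table#data" --table --json
```

## Installing dependencies

```bash
pip install requests beautifulsoup4
```

## Use cases

- Data collection and research
- Competitor monitoring
- Market intelligence gathering
- Content aggregation

## Tags

scraping, web, data, python, automation, crawler, data-collection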
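
## Example: table extraction and export (illustrative sketch)

The table mode and the CSV/JSON export described above could work roughly as follows. This is a sketch under stated assumptions, not the code in `scraper.py`: it swaps in the standard library's `html.parser` for BeautifulSoup so it runs without dependencies, and it ignores CSS selectors and reads every `<tr>` in the input. The names `TableParser`, `extract_rows`, `to_csv`, and `to_json` are hypothetical.

```python
import csv
import io
import json
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of each <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def extract_rows(html: str) -> list[list[str]]:
    """Return all table rows found in the HTML as lists of cell text."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    return parser.rows


def to_csv(rows: list[list[str]]) -> str:
    """Serialize rows as CSV text (the default export format)."""
    buffer = io.StringIO(newline="")
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


def to_json(rows: list[list[str]]) -> str:
    """Serialize rows as JSON objects keyed by the first (header) row."""
    header, *body = rows
    records = [dict(zip(header, row)) for row in body]
    # ensure_ascii=False keeps Chinese text readable instead of \uXXXX escapes.
    return json.dumps(records, ensure_ascii=False, indent=2)


if __name__ == "__main__":
    sample = ("<table><tr><th>产品</th><th>价格</th></tr>"
              "<tr><td>茶</td><td>12</td></tr></table>")
    rows = extract_rows(sample)
    print(to_csv(rows))
    print(to_json(rows))
```

When writing the CSV to disk for spreadsheet use, opening the file with `encoding="utf-8-sig"` and `newline=""` is a common way to keep Chinese text readable in Excel. Whether the skill does this is not stated in the source.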