
Webclaw
Give Claude Code or Cursor a stdio MCP server that turns any public URL into clean markdown for competitor research, docs ingestion, and content reuse without hand-copying pages.
Overview
io.github.0xMassi/webclaw is an Idea-phase MCP server for web extraction that scrapes, crawls, extracts, and summarizes any URL into clean markdown for coding agents.
What is this MCP server?
- Scrape a single URL to structured, agent-friendly markdown
- Crawl linked pages for deeper site captures when one page is not enough
- Extract and summarize page content so agents get signal instead of raw HTML noise
- Stdio MCP transport via npm package create-webclaw (v0.1.3) for Claude Code, Cursor, and other MCP hosts
- Fits research, validation, and launch workflows that start from someone else’s website
- Server schema version 0.1.3
- npm registry identifier create-webclaw with stdio MCP transport
- Capabilities described as scrape, crawl, extract, and summarize to markdown
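For a sense of what a host sends over stdio when an agent uses one of these capabilities, here is a minimal sketch of an MCP `tools/call` request. The message envelope follows the MCP JSON-RPC format, but the tool name `scrape` and its `url` argument are assumptions for illustration; the actual tool names and parameters are whatever the server reports via `tools/list`.

```python
import json


def tools_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as one newline-terminated JSON-RPC line."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    # The stdio transport frames messages as one JSON object per line.
    return json.dumps(message) + "\n"


# HYPOTHETICAL tool name and argument: check the server's tools/list output
# for the real names before relying on this.
line = tools_call(1, "scrape", {"url": "https://example.com/pricing"})
print(line, end="")
```

In practice the MCP host (Claude Code, Cursor, and so on) builds and sends this message after its initialization handshake, so you rarely write it by hand; the sketch only shows the shape of the exchange.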
What problem does it solve?
Solo builders waste agent turns copying messy HTML or fighting one-off scrapers when they need readable web content in the chat.
Who is it for?
Indie builders doing competitor and audience research, ingesting public docs, or summarizing marketing pages inside Claude Code, Cursor, or Codex with stdio MCP.
Skip if: You need authenticated sessions, heavy anti-bot bypass, or a fully managed crawling platform with SLAs and compliance review.
What do I get? / Deliverables
After you add webclaw to your MCP host, agents can pull normalized markdown from target URLs so research and spec work stays in one thread.
- Clean markdown representations of single URLs or crawled pages
- Summarized extractions suitable for notes, specs, and agent follow-up
- Repeatable web capture workflows without bespoke scraper scripts per site
Journey fit
Arbitrary URL extraction is usually the first web-automation need in the solo journey. It comes up before you have a product URL of your own, when you are reading competitors, docs, and landing pages during opportunity research. The research subphase is where builders collect and normalize off-site information, and markdown output maps directly into notes, specs, and agent context.
How it compares
Webclaw is a stdio MCP web-scraping integration, not an in-repo agent skill or a curated skills marketplace.
Common Questions / FAQ
Who is Webclaw for?
Solo and indie builders who use MCP-enabled coding agents and need URL-to-markdown extraction for research, validation, and content tasks without custom scraper code.
When should I use Webclaw?
Use it when an answer depends on a live web page (a competitor site, docs, or an article) and you want crawl, extract, or summarize output as clean markdown in the agent session.
How do I add Webclaw to my agent?
Install the npm package create-webclaw (the registry identifier in the server metadata), add an MCP server entry with stdio transport in Claude Code, Cursor, or your host’s config, then restart the client so the tools load.
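As a rough illustration of the config step, the sketch below builds a stdio server entry in the `mcpServers` shape used by project-level files such as `.mcp.json` (Claude Code) and `.cursor/mcp.json` (Cursor). The launch command `npx -y create-webclaw` is an assumption based only on the package name; check the package's own instructions for the actual command and arguments. The helper `add_stdio_server` is a hypothetical name for this example.

```python
import json


def add_stdio_server(config: dict, name: str, command: str, args: list) -> dict:
    """Return a copy of an MCP host config with one stdio server entry added."""
    servers = dict(config.get("mcpServers", {}))  # keep any existing servers
    servers[name] = {"command": command, "args": list(args)}
    return {**config, "mcpServers": servers}


# ASSUMPTION: launching the server via npx with the package name from the
# registry metadata. The real command may differ.
config = add_stdio_server({}, "webclaw", "npx", ["-y", "create-webclaw"])
print(json.dumps(config, indent=2))
```

The printed JSON can be merged into your host's MCP config file. Returning a new dict rather than mutating the input keeps existing server entries intact if you adapt this to read and rewrite a real config.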