ShadowCrawl

Name: ShadowCrawl
Author: DevsHero

Repository: DevsHero/ShadowCrawl

Let your coding agent search and scrape the live web when normal fetchers hit Cloudflare, DataDome, or bot checks.

Overview

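ShadowCrawl is an MCP server for idea-phase research. It gives coding agents stealth web search and scraping, falls back to the Chrome DevTools Protocol (CDP) when automation is blocked, and hands off to a human (human-in-the-loop) when a page needs manual clearance.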

What is this MCP server?

Rust MCP server with stealth scraping and search-oriented tools for agent workflows
Anti-bot bypass patterns with Chrome DevTools Protocol (CDP) fallback when automation is blocked
Human-in-the-loop (HITL) path when a page requires a human (non-robot) check or needs manual clearance
Designed as MCP tools so Claude Code, Cursor, and similar agents can invoke scrape/search without custom scripts
Catalog metadata lists server schema version 2.3.0, tied to the DevsHero/ShadowCrawl GitHub repository

Compatible agents: Claude Code, Cursor, Codex, and any other MCP-compatible agent

Community signal: 66 GitHub stars.

What problem does it solve?

Your agent cannot pull live competitor or market pages because Cloudflare, DataDome, and similar defenses return empty or challenge pages to normal scrapers.
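The failure mode described here can often be spotted before an agent wastes tokens on an empty page. The following is a minimal sketch, not ShadowCrawl's own detection logic: the function name `looks_blocked`, the status codes, the length threshold, and the marker strings are all illustrative assumptions and are far from exhaustive.

```python
# Heuristic markers; assumptions for illustration, not an official or complete list.
CHALLENGE_MARKERS = (
    "just a moment",         # title text often seen on Cloudflare interstitials
    "captcha-delivery.com",  # domain often referenced by DataDome challenge pages
    "verify you are human",  # generic bot-check wording
)


def looks_blocked(status: int, body: str, min_length: int = 200) -> bool:
    """Return True when a response looks like a block or challenge page."""
    # Status codes frequently used for refusals and rate limits.
    if status in (403, 429, 503):
        return True
    text = body.strip().lower()
    # Very short bodies are treated as "empty" for research purposes.
    if len(text) < min_length:
        return True
    # Otherwise, look for known challenge wording in the body.
    return any(marker in text for marker in CHALLENGE_MARKERS)
```

A check like this only tells the agent that a plain fetch failed; it does nothing to get past the defense.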

Who is it for?

Solo builders running Claude Code or Cursor who need agent-driven web research on anti-bot sites during idea and validation work.

Skip if: you only need sitemap-friendly public APIs, or you want a managed, compliance-first enterprise crawler with legal review built in.

What do I get? / Deliverables

After you wire ShadowCrawl into MCP, agents can request protected web content through stealth tooling and escalate to CDP or HITL instead of stopping research cold.

MCP tools that return search results or page content from defended sites
Agent workflows that chain automated scrape, CDP retry, and HITL clearance
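The chained workflow (automated scrape, then CDP retry, then HITL clearance) amounts to an ordered fallback. The sketch below shows that pattern in plain Python under stated assumptions: `fetch_with_escalation` and the stage callables are hypothetical names for illustration and do not correspond to ShadowCrawl's actual tool names or internals.

```python
from typing import Callable, Optional, Sequence, Tuple

# A stage takes a URL and returns page content, or None/"" if it was blocked.
Fetcher = Callable[[str], Optional[str]]


def fetch_with_escalation(
    url: str, stages: Sequence[Tuple[str, Fetcher]]
) -> Tuple[str, str]:
    """Try each stage in order; return (stage_name, content) from the first success."""
    errors = []
    for name, fetch in stages:
        try:
            content = fetch(url)
        except Exception as exc:  # a failing stage should not stop the chain
            errors.append(f"{name}: {exc}")
            continue
        if content:
            return name, content
        errors.append(f"{name}: empty response")
    raise RuntimeError("all stages failed: " + "; ".join(errors))
```

In use, the stages would be passed cheapest first, for example `[("stealth", stealth_fetch), ("cdp", cdp_fetch), ("hitl", ask_human)]`, so a human is only asked when both automated paths have failed.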

Recommended MCP Servers

1stDibs

The 1stDibs MCP server exposes browse-and-search capabilities against the 1stDibs luxury goods marketplace through a hos…

2Captcha MCP (aruxojuyu665/2Captcha-MCP)

2Captcha MCP exposes the commercial 2Captcha API to MCP hosts with 43 tools—31 focused on captcha solving plus managemen…

4fetch

4fetch is a hosted MCP server that fetches a URL and returns clean Markdown with metadata so coding agents can quote pag…

Acrawl (Mingye-Lu/AgenticCrawler)

acrawl (Agentic Crawler) is a Model Context Protocol server that packages autonomous web browsing into a single local bi… (5 stars)

Agentfetch (bch1212/agentfetch-mcp)

Agentfetch MCP is a token-budgeted web retrieval server for AI coding agents. Solo builders doing idea-phase competitor …

AgenticTotem Web Extractor

AgenticTotem Web Extractor is a hosted MCP server for AI web extraction: you supply URLs and a JSON Schema, and the serv…

Journey fit

Primary fit

Idea: Opportunity & market research

Solo builders usually need protected-site data earliest, while validating ideas and sizing markets before they commit to a full build. The research subphase is where competitor pages, pricing tables, and audience signals are gathered from sites that block simple HTTP clients.

How it compares

ShadowCrawl is an MCP scraping integration, not a no-code marketing scraper or a generic filesystem skill.

Common Questions / FAQ

Who is ShadowCrawl for?

ShadowCrawl is for indie developers and agent users who want MCP-native stealth search and scrape tools when standard HTTP fetch fails on protected sites.

When should I use ShadowCrawl?

Use it during market and competitor research, or anytime your coding agent must read pages behind bot detection and you are willing to handle site policy yourself.

How do I add ShadowCrawl to my agent?

Install or build the ShadowCrawl MCP server from the DevsHero repository, add it to your client’s MCP server config (stdio or your transport), restart the agent, and call the server’s scrape or search tools from the tool palette.
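Many MCP clients read a JSON file containing an `mcpServers` map of stdio commands. The sketch below builds such an entry; the helper name `mcp_server_entry`, the server key `shadowcrawl`, and the binary path are placeholders assumed for illustration. Check the DevsHero/ShadowCrawl README and your client's documentation for the real binary name, arguments, and config file location.

```python
import json


def mcp_server_entry(name: str, command: str, args=None, env=None) -> dict:
    """Build a stdio-style MCP server entry in the common `mcpServers` layout."""
    entry = {"command": command, "args": list(args or [])}
    if env:
        entry["env"] = dict(env)
    return {"mcpServers": {name: entry}}


# Hypothetical path: replace with wherever you installed or built the server.
config = mcp_server_entry("shadowcrawl", "/path/to/shadowcrawl-mcp")
print(json.dumps(config, indent=2))
```

Paste the printed JSON into your client's MCP config (merging with any existing `mcpServers` entries), then restart the agent so it picks up the new server.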
