
Firecrawl Scraper
Wire Firecrawl’s scrape, crawl, search, extract, and change-tracking APIs into Claude Code or Cursor so solo builders can pull LLM-ready web data without bespoke scrapers.
Overview
firecrawl-scraper is an agent skill most often used in Build (also Idea research and Operate monitoring) that documents how to call the Firecrawl API for scrape, crawl, search, extract, agent, batch, and change-tracking
Install
npx skills add https://github.com/jezweb/claude-skills --skill firecrawl-scraper
What is this skill?
- Covers eight operation families: /scrape, /crawl, /map, /search, /extract, /agent, batch jobs, and change tracking on AP
- Documents JavaScript rendering, anti-bot bypass, and PDF/DOCX parsing plus branding or design-system extraction
- Includes troubleshooting for content that does not load, bot detection, and ten documented error patterns to avoid
- Supports autonomous /agent gathering and web search plus scrape for research-heavy agent tasks
- Maps official Firecrawl docs so agents pick the right endpoint instead of one-off fetch hacks
- Firecrawl API v2 with eight documented operation families: scrape, crawl, map, search, extract, agent, batch, and change
- Documents prevention guidance for 10 named error patterns
- Covers JavaScript rendering, anti-bot bypass, and PDF/DOCX parsing
Adoption & trust: 527 installs on skills.sh; 841 GitHub stars; 2/3 security scanners passed (skills.sh audits).
What problem does it solve?
You need clean, LLM-ready text or structured fields from real websites, but raw HTTP requests break on JavaScript, bots, PDFs, and shifting page layouts.
Who is it for?
Solo builders whose agents must ingest public web pages, crawl small sites, or watch pages for updates using a managed Firecrawl key instead of custom scraper code.
Skip if: you only need a single static file from disk, operate fully offline with no API budget, or require a guaranteed compliance review of target sites without human legal judgment.
When should I use this skill?
Use when scraping websites, crawling sites, web search plus scrape, autonomous data gathering, monitoring content changes, extracting brand or design systems, or troubleshooting content not loading, JavaScript rendering,
What do I get? / Deliverables
Your agent follows Firecrawl v2 endpoint patterns, auth, and documented error fixes so pages become reliable markdown or JSON—and you can re-run change tracking when content updates matter.
- Agent-ready Firecrawl endpoint and parameter patterns
- Structured or markdown page payloads suitable for LLM context
- Change-tracking or batch scrape execution plans
Journey fit
Spans multiple journey phases - primary shelf plus alternate fits below.
Firecrawl is an external API that agents call during implementation, so its canonical shelf is Build → integrations, even though the same calls support earlier research and later monitoring. The skill documents endpoint usage, auth, and failure modes for hooking a third-party scraping platform into agent workflows, which is integration work rather than app UI or pure backend logic.
Where it fits
Map and scrape competitor marketing pages into markdown before you commit to a product angle.
Pull pricing and feature copy from reference sites to sanity-check positioning on a landing-page spec.
Implement Firecrawl /scrape and /extract calls inside an agent tool that feeds your RAG or content pipeline.
Batch-scrape source URLs you cite so downstream content workflows stay aligned with live pages.
Run change tracking on docs or status pages you depend on so you notice drift before users do.
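As a rough illustration of the Build-phase item above (a `/scrape` call inside an agent tool), the sketch below builds and optionally sends a scrape request using only the Python standard library. The base URL `https://api.firecrawl.dev/v2`, the `formats` body field, and the Bearer-token header are assumptions that should be checked against the official docs; `build_scrape_request` and `scrape` are hypothetical helper names, not part of any Firecrawl SDK.

```python
import json
import urllib.request

# Assumption: v2 REST base URL; confirm against https://docs.firecrawl.dev.
FIRECRAWL_BASE_URL = "https://api.firecrawl.dev/v2"


def build_scrape_request(url: str, api_key: str, formats=("markdown",)) -> urllib.request.Request:
    """Build (but do not send) a POST request for the /scrape endpoint."""
    # Assumption: the body takes the target URL plus a list of output formats.
    payload = {"url": url, "formats": list(formats)}
    return urllib.request.Request(
        f"{FIRECRAWL_BASE_URL}/scrape",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def scrape(url: str, api_key: str, timeout_s: float = 60.0) -> dict:
    """Send the request and return the decoded JSON response body."""
    request = build_scrape_request(url, api_key)
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes the tool easy to unit-test without spending API credits; only `scrape` touches the network.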
How it compares
Use this procedural Firecrawl integration skill instead of asking the agent to invent fetch logic or one-off BeautifulSoup scripts for every site.
Common Questions / FAQ
Who is firecrawl-scraper for?
It is for solo and indie developers using Claude Code, Cursor, or Codex who want Firecrawl scrape, crawl, map, search, extract, agent, batch, and change-tracking APIs wired into agent workflows with v2-aligned guidance.
When should I use firecrawl-scraper?
Use it when scraping or crawling websites for LLM context, running search-plus-scrape research in Idea or Validate, integrating live web data during Build, troubleshooting JS or bot-blocked pages, extracting branding references, or monitoring page changes in Operate.
Is firecrawl-scraper safe to install?
Treat it like any third-party API skill: it implies network access and API secrets. Review the Security Audits panel on this Prism page and rotate Firecrawl keys; scraping targets remain your responsibility.
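Because the skill implies API secrets, one common way to keep a key out of source code is to read it from the environment. The sketch below is a minimal example of that pattern; the variable name `FIRECRAWL_API_KEY` and the helper names `load_firecrawl_api_key` and `redact` are assumptions for illustration, not something the skill prescribes.

```python
import os

# Assumption: the key lives in an environment variable named FIRECRAWL_API_KEY.
API_KEY_ENV_VAR = "FIRECRAWL_API_KEY"


def load_firecrawl_api_key(env=None) -> str:
    """Return the API key from the environment mapping, or raise if it is missing."""
    env = os.environ if env is None else env
    key = env.get(API_KEY_ENV_VAR, "").strip()
    if not key:
        raise RuntimeError(
            f"{API_KEY_ENV_VAR} is not set; export it instead of hardcoding the key."
        )
    return key


def redact(key: str) -> str:
    """Mask a key for logs, keeping only the last four characters visible."""
    return "*" * max(len(key) - 4, 0) + key[-4:]
```

Failing fast on a missing key surfaces configuration problems before any request is sent, and `redact` lets you log which key is in use without exposing it, which helps when rotating keys.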
SKILL.md
{ "name": "firecrawl-scraper", "description": "Convert websites into LLM-ready data with Firecrawl API. Features: scrape, crawl, map, search, extract, agent (autonomous), batch operations, and change tracking. Handles JavaScript, anti-bot bypass, PDF/DOCX parsing, and branding extraction. Prevents 10 documented errors. Use when: scraping websites, crawling sites, web search + scrape, autonomous data gathering, monitoring content changes, extracting brand/design systems, or troubleshooting content not loading, JavaScript rendering, bot detect", "version": "1.0.0", "author": { "name": "Jeremy Dawes", "email": "jeremy@jezweb.net" }, "license": "MIT", "repository": "https://github.com/jezweb/claude-skills", "keywords": [] }

# Firecrawl Web Scraper

**Status**: Production Ready
**Last Updated**: 2026-01-20
**API Version**: v2
**Official Docs**: https://docs.firecrawl.dev

---

## What This Skill Does

Provides complete knowledge for using Firecrawl API - a web data platform that converts websites into LLM-ready markdown or structured data. This skill covers:

- **Single page scraping** with `/scrape` endpoint
- **Full site crawling** with `/crawl` endpoint
- **URL discovery** with `/map` endpoint
- **Web search + scrape** with `/search` endpoint (NEW)
- **Structured data extraction** with `/extract` endpoint
- **Autonomous AI agent** with `/agent` endpoint (NEW)
- **Batch operations** for multiple URLs (NEW)
- **Change tracking** to monitor content changes (NEW)
- **Branding extraction** for design systems (NEW)
- **Python SDK** (firecrawl-py v4.13.0+)
- **TypeScript/Node.js SDK** (@mendable/firecrawl-js v4.11.1+)
- **JavaScript rendering**, anti-bot bypass, PDF/DOCX parsing
- **Cloudflare Workers integration** (REST API)

**Prevents common issues** with API authentication, rate limiting, timeout errors, and content extraction failures.
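The rate-limiting and timeout issues mentioned above are often handled on the client side by retrying with exponential backoff. That is a generic technique chosen here for illustration, not something the skill itself documents. The sketch assumes HTTP 429 signals rate limiting and 408/504 signal timeouts; `send` is a hypothetical zero-argument callable returning `(status_code, body)`.

```python
import time

# Assumption: 429 signals rate limiting and 408/504 signal timeouts.
RETRYABLE_STATUSES = frozenset({408, 429, 504})


def call_with_backoff(send, max_attempts=4, base_delay_s=1.0, sleep=time.sleep):
    """Call `send` until it returns a non-retryable status or attempts run out.

    `send` is a hypothetical zero-argument callable returning (status_code, body).
    `sleep` is injectable so tests can record delays instead of waiting.
    """
    status, body = send()
    for attempt in range(1, max_attempts):
        if status not in RETRYABLE_STATUSES:
            break
        # Double the delay on each retry: base, 2x base, 4x base, ...
        sleep(base_delay_s * 2 ** (attempt - 1))
        status, body = send()
    return status, body
```

Wrapping a real request is then a matter of passing a closure, for example `call_with_backoff(lambda: my_request())`, where `my_request` is whatever function performs the HTTP call and returns the status and body.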
---

## Auto-Trigger Keywords

Claude automatically uses this skill when you mention:

### Primary Triggers (Technologies)

- firecrawl
- firecrawl api
- firecrawl-py
- firecrawl-js
- web scraping
- web crawler
- site crawler
- scrape website
- crawl website
- web scraper
- firecrawl agent
- firecrawl search

### Secondary Triggers (Use Cases)

- extract web content
- html to markdown
- convert website to markdown
- scrape documentation
- crawl documentation site
- extract structured data
- parse website
- content extraction
- web automation
- website to llm
- llm ready data
- rag from website
- scrape articles
- extract product data
- map website urls
- batch scrape
- scrape multiple urls
- monitor website changes
- track content changes
- extract brand colors
- extract design system
- autonomous web scraping
- ai web agent

### Error-Based Triggers

- "content not loading"
- "javascript rendering issues"
- "blocked by bot detection"
- "scraping blocked"
- "captcha blocking scraper"
- "dynamic content not scraping"
- "anti-bot protection"
- "scraper detected"
- "cloudflare challenge"
- "timeout scraping"
- "empty scrape result"
- "rate limit exceeded firecrawl"
- "invalid api key firecrawl"

### Framework Integration

- firecrawl cloudflare workers
- firecrawl python
- firecrawl typescript
- firecrawl node.js
- scraping with cloudflare
- serverless web scraping

---

## Known Issues Prevented

| Issue | Error Message | Prevention |
|-------|---------------|------------|
| **#1: API Key Not Set** | "Invalid API Key" | Proper environment variable setup |
| **#2: Rate Limits** | "Rate limit exceeded" | Credit optimization best practices |
| **#3: Timeout Errors** | "Request timeout" | `waitFor` and `timeout` configuration |
| **#4: Empty Content** | "Content is empty" | JS rendering with actions/wait |
| **#5: Bot Detection** | "Access denied" | Stealth mode and location options |
| **#6: Hardcoded API Keys** | Security vulnerability | Environment variable patterns |

---

## API Endpoints Overview

| Endpoint | Purpose | Use Case |
|----------|---------|----------|
| `/scrape`