Firecrawl Scraper

Firecrawl is an external API integration that agents call during implementation, so its canonical shelf is Build → integrations, even though the same calls support earlier research and later monitoring. The skill documents endpoint usage, auth, and failure modes for hooking a third-party scraping platform into agent workflows, which is integration work rather than app UI or pure backend logic.

Where it fits

Example use

Map and scrape competitor marketing pages into markdown before you commit to a product angle.

Example use

Validate: Scope & plan

Pull pricing and feature copy from reference sites to sanity-check positioning on a landing-page spec.

Example use

Implement Firecrawl /scrape and /extract calls inside an agent tool that feeds your RAG or content pipeline.

Example use

Grow: Content & marketing

Batch-scrape source URLs you cite so downstream content workflows stay aligned with live pages.

Example use

Run change tracking on docs or status pages you depend on so you notice drift before users do.

How it compares

Use this procedural Firecrawl integration skill instead of asking the agent to invent fetch logic or one-off BeautifulSoup scripts for every site.

Common Questions / FAQ

Who is firecrawl-scraper for?

It is for solo and indie developers using Claude Code, Cursor, or Codex who want Firecrawl scrape, crawl, map, search, extract, agent, batch, and change-tracking APIs wired into agent workflows with v2-aligned guidance.

When should I use firecrawl-scraper?

Use it when scraping or crawling websites for LLM context, running search-plus-scrape research in Idea or Validate, integrating live web data during Build, troubleshooting JS or bot-blocked pages, extracting branding references, or monitoring page changes in Operate.

Is firecrawl-scraper safe to install?

Treat it like any third-party API skill: it implies network access and API secrets. Review the Security Audits panel on this Prism page and rotate Firecrawl keys; scraping targets remain your responsibility.

SKILL.md

README / SKILL.md - Firecrawl Scraper

{
  "name": "firecrawl-scraper",
  "description": "Convert websites into LLM-ready data with Firecrawl API. Features: scrape, crawl, map, search, extract, agent (autonomous), batch operations, and change tracking. Handles JavaScript, anti-bot bypass, PDF/DOCX parsing, and branding extraction. Prevents 10 documented errors. Use when: scraping websites, crawling sites, web search + scrape, autonomous data gathering, monitoring content changes, extracting brand/design systems, or troubleshooting content not loading, JavaScript rendering, bot detect",
  "version": "1.0.0",
  "author": {
    "name": "Jeremy Dawes",
    "email": "jeremy@jezweb.net"
  },
  "license": "MIT",
  "repository": "https://github.com/jezweb/claude-skills",
  "keywords": []
}


# Firecrawl Web Scraper

**Status**: Production Ready
**Last Updated**: 2026-01-20
**API Version**: v2
**Official Docs**: https://docs.firecrawl.dev

---

## What This Skill Does

Provides complete knowledge for using the Firecrawl API, a web data platform that converts websites into LLM-ready markdown or structured data. This skill covers:

- **Single page scraping** with `/scrape` endpoint
- **Full site crawling** with `/crawl` endpoint
- **URL discovery** with `/map` endpoint
- **Web search + scrape** with `/search` endpoint (NEW)
- **Structured data extraction** with `/extract` endpoint
- **Autonomous AI agent** with `/agent` endpoint (NEW)
- **Batch operations** for multiple URLs (NEW)
- **Change tracking** to monitor content changes (NEW)
- **Branding extraction** for design systems (NEW)
- **Python SDK** (firecrawl-py v4.13.0+)
- **TypeScript/Node.js SDK** (@mendable/firecrawl-js v4.11.1+)
- **JavaScript rendering**, anti-bot bypass, PDF/DOCX parsing
- **Cloudflare Workers integration** (REST API)

**Prevents common issues** with API authentication, rate limiting, timeout errors, and content extraction failures.
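As a rough illustration of the single-page scraping item above, the sketch below builds (but does not send) a `/scrape` request using only the Python standard library. The endpoint URL `https://api.firecrawl.dev/v2/scrape`, the `formats` field, and the bearer-token header are assumptions based on Firecrawl's public v2 documentation, not on this skill's text. The helper name `build_scrape_request` is hypothetical. Verify all of them against https://docs.firecrawl.dev before relying on them.

```python
import json
import urllib.request

# Assumed v2 endpoint; confirm against the official docs.
FIRECRAWL_SCRAPE_URL = "https://api.firecrawl.dev/v2/scrape"


def build_scrape_request(target_url: str, api_key: str) -> urllib.request.Request:
    """Build a POST request asking for a single page as markdown.

    The payload shape ("url" plus "formats") is an assumption, not
    something this skill's text confirms.
    """
    payload = {"url": target_url, "formats": ["markdown"]}
    return urllib.request.Request(
        FIRECRAWL_SCRAPE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send the request, pass the result to `urllib.request.urlopen(req, timeout=60)` and decode the body with `json.load`. Building the request separately from sending it keeps the payload easy to inspect and test without spending credits.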

---

## Auto-Trigger Keywords

Claude automatically uses this skill when you mention:

### Primary Triggers (Technologies)
- firecrawl
- firecrawl api
- firecrawl-py
- firecrawl-js
- web scraping
- web crawler
- site crawler
- scrape website
- crawl website
- web scraper
- firecrawl agent
- firecrawl search

### Secondary Triggers (Use Cases)
- extract web content
- html to markdown
- convert website to markdown
- scrape documentation
- crawl documentation site
- extract structured data
- parse website
- content extraction
- web automation
- website to llm
- llm ready data
- rag from website
- scrape articles
- extract product data
- map website urls
- batch scrape
- scrape multiple urls
- monitor website changes
- track content changes
- extract brand colors
- extract design system
- autonomous web scraping
- ai web agent

### Error-Based Triggers
- "content not loading"
- "javascript rendering issues"
- "blocked by bot detection"
- "scraping blocked"
- "captcha blocking scraper"
- "dynamic content not scraping"
- "anti-bot protection"
- "scraper detected"
- "cloudflare challenge"
- "timeout scraping"
- "empty scrape result"
- "rate limit exceeded firecrawl"
- "invalid api key firecrawl"

### Framework Integration
- firecrawl cloudflare workers
- firecrawl python
- firecrawl typescript
- firecrawl node.js
- scraping with cloudflare
- serverless web scraping

---

## Known Issues Prevented

| Issue | Error Message | Prevention |
|-------|---------------|------------|
| **#1: API Key Not Set** | "Invalid API Key" | Proper environment variable setup |
| **#2: Rate Limits** | "Rate limit exceeded" | Credit optimization best practices |
| **#3: Timeout Errors** | "Request timeout" | `waitFor` and `timeout` configuration |
| **#4: Empty Content** | "Content is empty" | JS rendering with actions/wait |
| **#5: Bot Detection** | "Access denied" | Stealth mode and location options |
| **#6: Hardcoded API Keys** | Security vulnerability | Environment variable patterns |
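
Issues #1, #2, and #6 in the table can be approached with a small amount of glue code. The following is a minimal sketch, not the skill's own implementation. The environment variable name `FIRECRAWL_API_KEY`, the `RateLimited` exception, and the helpers `load_api_key` and `with_backoff` are all hypothetical names chosen for illustration. Your own client code would raise `RateLimited` when the API reports a rate limit.

```python
import os
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical error your client raises when the API reports a rate limit."""


def load_api_key(env_var: str = "FIRECRAWL_API_KEY") -> str:
    """Read the key from the environment instead of hardcoding it."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"{env_var} is not set; export it before running")
    return key


def with_backoff(
    call: Callable[[], T],
    retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry `call` on RateLimited, doubling the delay after each failure."""
    for attempt in range(retries + 1):
        try:
            return call()
        except RateLimited:
            if attempt == retries:
                raise  # out of retries: surface the error to the caller
            sleep(base_delay * (2 ** attempt))
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so the retry schedule can be checked in tests without actually waiting.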

---

## API Endpoints Overview

| Endpoint | Purpose | Use Case |
|----------|---------|----------|
| `/scrape`

What is this skill?

Covers eight operation families: /scrape, /crawl, /map, /search, /extract, /agent, batch jobs, and change tracking on AP

Documents JavaScript rendering, anti-bot bypass, and PDF/DOCX parsing plus branding or design-system extraction

Includes troubleshooting for content that does not load, bot detection, and ten documented error patterns to avoid

Supports autonomous /agent gathering and web search plus scrape for research-heavy agent tasks

Maps official Firecrawl docs so agents pick the right endpoint instead of one-off fetch hacks

Firecrawl API v2 with eight documented operation families: scrape, crawl, map, search, extract, agent, batch, and change

Documents prevention guidance for 10 named error patterns

Covers JavaScript rendering, anti-bot bypass, and PDF/DOCX parsing

Compatible agents: Claude Code, Cursor, Codex, any compatible agent

Adoption & trust: 527 installs on skills.sh; 841 GitHub stars; 2/3 security scanners passed (skills.sh audits).

Who is it for?

Solo builders whose agents must ingest public web pages, crawl small sites, or watch pages for updates using a managed Firecrawl key instead of custom scraper code.

Skip if: you only need a single static file from disk, operate fully offline with no API budget, or require guaranteed compliance review of target sites without human legal judgment.

What do I get? / Deliverables

Your agent follows Firecrawl v2 endpoint patterns, auth, and documented error fixes so pages become reliable markdown or JSON, and you can re-run change tracking when content updates matter.

Agent-ready Firecrawl endpoint and parameter patterns

Structured or markdown page payloads suitable for LLM context

Change-tracking or batch scrape execution plans
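
To make the "markdown page payloads" deliverable concrete, here is a minimal sketch of pulling markdown out of a scrape response before it goes into LLM context. It assumes a response shaped like `{"success": true, "data": {"markdown": "..."}}`, which matches Firecrawl's public docs as far as I know but is not confirmed by this page. The helper name `extract_markdown` is hypothetical.

```python
from typing import Any


def extract_markdown(response: dict[str, Any]) -> str:
    """Return the markdown body of a scrape response.

    Raises ValueError when the call failed or the content is empty,
    so empty pages are caught before they reach a prompt.
    """
    if not response.get("success"):
        raise ValueError(f"scrape failed: {response.get('error', 'unknown error')}")
    data = response.get("data") or {}
    markdown = (data.get("markdown") or "").strip()
    if not markdown:
        raise ValueError("scrape returned empty content")
    return markdown
```

Failing loudly on empty content is a deliberate choice here: a silent empty string tends to surface much later as a confusing model answer.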

Journey fit

Spans multiple journey phases: a primary shelf (Build → integrations) plus alternate fits, listed under "Where it fits".
