
Firecrawl Scraper
Scrape, crawl, map, or batch-fetch web pages into agent-ready content via Firecrawl without wiring REST calls by hand.
Overview
Firecrawl Scraper is an agent skill for the Idea phase, also useful for Build integrations, that runs Firecrawl scrape, crawl, map, and batch-scrape jobs through a Node CLI helper.
Install
npx skills add https://github.com/benedictking/firecrawl-scraper --skill firecrawl-scraper
What is this skill?
- CLI wrapper around Firecrawl scrape, crawl, map, batch-scrape, and crawl-status endpoints
- Accepts JSON via argv, stdin, or --file payloads for repeatable agent workflows
- Optional --wait on long-running crawl jobs with status polling by crawl id
- Loads FIRECRAWL_API_KEY from environment or a local .env next to the helper script
- 5 CLI subcommands: scrape, crawl, map, batch-scrape, crawl-status
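The payload options listed above are checked in a fixed order by the helper's readPayload function: --file first, then --data, then a positional JSON argument, then stdin. The following is a minimal sketch of that precedence as a pure function, assuming `args` is the argument list after the subcommand. The name `describePayloadSource` and its return shape are hypothetical and are not part of the helper.

```javascript
// Sketch of the payload lookup order, mirroring the helper's readPayload().
// `describePayloadSource` is a hypothetical name used only for illustration.
function describePayloadSource(args, stdinIsTTY) {
  const fileIndex = args.indexOf('--file');
  if (fileIndex !== -1) {
    if (!args[fileIndex + 1]) throw new Error('Missing value for --file');
    return { source: 'file', value: args[fileIndex + 1] };
  }

  const dataIndex = args.indexOf('--data');
  if (dataIndex !== -1) {
    if (!args[dataIndex + 1]) throw new Error('Missing value for --data');
    return { source: 'data', value: args[dataIndex + 1] };
  }

  // A first argument that is not a flag is treated as inline JSON.
  if (args[0] && !args[0].startsWith('-')) {
    return { source: 'arg', value: args[0] };
  }

  // With nothing else supplied, the payload must be piped in.
  if (stdinIsTTY) throw new Error('No payload provided');
  return { source: 'stdin', value: null };
}
```

Under this ordering, a call such as `crawl --wait < payload.json` falls through to stdin, because the first argument is a flag.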
Adoption & trust: 512 installs on skills.sh; 9 GitHub stars; 1/3 security scanners passed (skills.sh audits).
What problem does it solve?
You need structured page content from the live web inside an agent workflow and do not want to hand-write Firecrawl REST clients for every task.
Who is it for?
Solo builders who already have or will create a Firecrawl API key and want repeatable scrape-and-crawl steps inside Claude Code, Cursor, or Codex.
Skip if: you need a no-code scraper UI only, cannot store API secrets locally, or want deep custom parsing logic without post-processing the returned markdown or HTML.
When should I use this skill?
When you need to scrape, crawl, map, or batch-process URLs through Firecrawl from an agent session and have or will configure FIRECRAWL_API_KEY.
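For reference, the key lookup can be sketched as a pure function: the environment variable wins, and a .env-style string is the fallback. This mirrors the helper's loadApiKey logic but takes its inputs as parameters so it can run without touching the filesystem. The name `resolveApiKey` is hypothetical.

```javascript
// Sketch of the key lookup: the environment wins, then a .env-style string.
// `resolveApiKey` is a hypothetical name; the helper's own function is loadApiKey().
function resolveApiKey(env, envFileContent) {
  if (env.FIRECRAWL_API_KEY) return env.FIRECRAWL_API_KEY;
  if (typeof envFileContent !== 'string') return null;

  const match = envFileContent.match(/FIRECRAWL_API_KEY\s*=\s*(.+)/);
  if (!match) return null;

  // Trim whitespace and strip a leading and a trailing quote, as the helper does.
  return match[1].trim().replace(/^["']|["']$/g, '');
}
```

A call such as `resolveApiKey(process.env, null)` returns null when the key is not configured anywhere, which is the case a caller would need to report before making any request.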
What do I get? / Deliverables
Your agent can call standardized Firecrawl operations with JSON payloads and get crawl results or status back for downstream research, docs, or indexing steps.
- JSON or text API responses from Firecrawl operations
- Completed crawl or scrape payloads for downstream summarization or storage
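Getting a completed crawl back usually means polling the job status until it settles. The sketch below shows one way such a wait loop could look. It is not the helper's actual implementation: `fetchStatus` is a hypothetical caller-supplied async function, and the `completed` and `failed` status values, the interval, and the attempt limit are all assumptions.

```javascript
// Sketch of a --wait style polling loop.
// `fetchStatus` is a hypothetical async function supplied by the caller;
// the 'completed' and 'failed' status values are assumptions.
async function waitForCrawl(fetchStatus, { intervalMs = 2000, maxAttempts = 150 } = {}) {
  for (let attempt = 0; attempt < maxAttempts; attempt += 1) {
    const result = await fetchStatus();
    if (result.status === 'completed') return result;
    if (result.status === 'failed') throw new Error('Crawl failed');

    // Any other status is treated as "still running": pause, then poll again.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error('Timed out waiting for crawl');
}
```

Capping the number of attempts keeps an agent session from hanging indefinitely on a job that never finishes.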
Journey fit
Spans multiple journey phases - primary shelf plus alternate fits below.
Competitive and market research usually starts in Idea, but the same integration supports content pipelines later. Canonical shelf is research because solo builders install Firecrawl first to harvest URLs and site structure before they commit to a build.
Where it fits
Crawl three competitor pricing pages into markdown before you choose a positioning angle.
Scrape a reference app's marketing copy to seed a landing-page prototype.
Map and batch-scrape documentation URLs to refresh an in-app help sidebar.
Pull updated blog posts from partner sites for a weekly digest workflow.
How it compares
Use this skill to orchestrate Firecrawl API calls from the agent, not to write one-off curl commands or build a separate Python Scrapy project.
Common Questions / FAQ
Who is firecrawl-scraper for?
It is for solo and indie builders using AI coding agents who need Firecrawl-powered scraping and crawling as part of research, content, or integration workflows.
When should I use firecrawl-scraper?
Use it during Idea research to crawl competitor sites, during Validate when you prototype pages from live references, and during Build when you integrate external web content into your app or docs pipeline.
Is firecrawl-scraper safe to install?
Review the Security Audits panel on this Prism page, treat your Firecrawl API key as a secret, and avoid pointing crawls at sensitive or authenticated pages you are not authorized to access.
SKILL.md - Firecrawl Scraper
# Firecrawl API Key Configuration
# Get your API key from: https://www.firecrawl.dev/app/api-keys
FIRECRAWL_API_KEY=your_api_key_here

.env
node_modules/

#!/usr/bin/env node
/**
 * Firecrawl API Helper Script
 * Provides a CLI wrapper around Firecrawl endpoints for skill integration.
 *
 * Usage:
 *   node firecrawl-api.js <scrape|crawl|map|batch-scrape|crawl-status> [<json-string>]
 *   cat payload.json | node firecrawl-api.js scrape
 *   node firecrawl-api.js scrape --file ./payload.json
 *   node firecrawl-api.js crawl --wait < payload.json
 *   node firecrawl-api.js crawl-status <crawl-id> [--wait]
 */

const https = require('https');
const fs = require('fs');
const path = require('path');

const API_BASE = 'https://api.firecrawl.dev';

function loadApiKey() {
  if (process.env.FIRECRAWL_API_KEY) {
    return process.env.FIRECRAWL_API_KEY;
  }

  const envPath = path.join(__dirname, '.env');
  if (!fs.existsSync(envPath)) {
    return null;
  }

  const envContent = fs.readFileSync(envPath, 'utf8');
  const match = envContent.match(/FIRECRAWL_API_KEY\s*=\s*(.+)/);
  if (!match) {
    return null;
  }

  return match[1].trim().replace(/^["']|["']$/g, '');
}

function usage() {
  const cmd = path.basename(process.argv[1] || 'firecrawl-api.js');
  console.error(
    [
      'Usage:',
      ` node ${cmd} <scrape|crawl|map|batch-scrape|crawl-status> [<json-string>]`,
      ` cat payload.json | node ${cmd} scrape`,
      ` node ${cmd} scrape --file ./payload.json`,
      ` node ${cmd} crawl --wait < payload.json`,
      ` node ${cmd} crawl-status <crawl-id> [--wait]`,
      '',
      'Options:',
      ' --wait Wait for crawl job completion (crawl / crawl-status only)',
      ' --id Crawl job id (crawl-status only)',
      '',
      'Env:',
      ' FIRECRAWL_API_KEY (env var) or .env file next to this script',
    ].join('\n'),
  );
}

function readStdin() {
  return new Promise((resolve, reject) => {
    let data = '';
    process.stdin.setEncoding('utf8');
    process.stdin.on('data', (chunk) => {
      data += chunk;
    });
    process.stdin.on('end', () => resolve(data));
    process.stdin.on('error', reject);
  });
}

async function readPayload(args) {
  const fileFlagIndex = args.findIndex((arg) => arg === '--file');
  if (fileFlagIndex !== -1) {
    const filePath = args[fileFlagIndex + 1];
    if (!filePath) {
      throw new Error('Missing value for --file');
    }
    const content = fs.readFileSync(filePath, 'utf8');
    return JSON.parse(content);
  }

  const dataFlagIndex = args.findIndex((arg) => arg === '--data');
  if (dataFlagIndex !== -1) {
    const json = args[dataFlagIndex + 1];
    if (!json) {
      throw new Error('Missing value for --data');
    }
    return JSON.parse(json);
  }

  if (args[0] && !args[0].startsWith('-')) {
    return JSON.parse(args[0]);
  }

  if (process.stdin.isTTY) {
    throw new Error('No payload provided (pass JSON arg, --data, --file, or pipe via stdin)');
  }

  const stdin = await readStdin();
  if (!stdin.trim()) {
    throw new Error('Empty stdin payload');
  }
  return JSON.parse(stdin);
}

function requestJson(method, endpointPath, apiKey, payload) {
  return new Promise((resolve, reject) => {
    const body = payload === undefined ? null : JSON.stringify(payload);
    const url = new URL(endpointPath, API_BASE);

    const req = https.request(
      url,
      {
        method,
        headers: {
          'Authorization': `Bearer ${apiKey}`,
          ...(body === null
            ? {}
            : {
                'Content-Type': 'application/json',
                'Content-Length': Buffer.byteLength(body),
              }),
          'User-Agent': 'Firecrawl-Skill/1.0',
        },
        timeout: 60_000,
      },
      (res) => {
        let data = '';
        res.setEncoding('utf8');
        res.on('data', (chunk) => {
          data += chunk;
        });
        res.on('end', () => {
          const ok = res.statusCode && res.statusCode >= 200