
Nutrient Document Processing
Convert, OCR, extract, redact, watermark, sign, and fill PDFs and office files through the Nutrient DWS Processor API from your agent.
Overview
nutrient-document-processing is an agent skill most often used in Build (also Ship, Launch) that processes and converts office documents via the Nutrient DWS API.
Install
npx skills add https://github.com/affaan-m/everything-claude-code --skill nutrient-document-processing
What is this skill?
- Nutrient DWS build endpoint for PDF, DOCX, XLSX, PPTX, HTML, and images
- Convert, OCR, table/text extraction, PII redaction, watermarks, digital signing, and PDF form fill
- Multipart POST with instructions JSON; NUTRIENT_API_KEY from nutrient.io dashboard
- Origin: ECC; commercial API with an explicit terms-review note in SKILL.md
- curl examples for DOCX↔PDF and HTML→PDF patterns agents can adapt
Adoption & trust: 4.9k installs on skills.sh; 210k GitHub stars; 3/3 security scanners passed (skills.sh audits).
What problem does it solve?
You need reliable PDF and office conversions, OCR, or redaction in an agent workflow, but one-off desktop apps do not compose with CI or coding agents.
Who is it for?
Solo builders automating contracts, invoices, onboarding PDFs, or scraped HTML exports inside Claude Code or Cursor.
Skip if: You need fully offline or air-gapped document editing; this skill requires a Nutrient API key and network access.
When should I use this skill?
User needs to process, convert, OCR, extract, redact, sign, or fill PDFs and office formats via Nutrient DWS API.
What do I get? / Deliverables
You produce converted, extracted, redacted, signed, or form-filled document binaries via a consistent API contract your agent can script.
- Converted or processed document binary
- OCR or extraction output per instructions JSON
Journey fit
Spans multiple journey phases - primary shelf plus alternate fits below.
Document pipelines most often land while building deliverables and internal docs, even though the same skill supports ship-time redaction and launch collateral. Docs subphase is the canonical shelf for format conversion and extraction that feed READMEs, contracts, and customer-facing PDFs during product build.
Where it fits
Convert a spec DOCX export to PDF for README attachments in the repo.
Redact PII from support ticket PDFs before sharing in a security review.
Render marketing HTML to a signed PDF for a Product Hunt asset bundle.
How it compares
Use as a hosted document processor integration, not a local-only pandoc wrapper or general MCP filesystem skill.
Common Questions / FAQ
Who is nutrient-document-processing for?
Developers and solo founders who want agent-driven PDF and office document operations through Nutrient's cloud Processor API.
When should I use nutrient-document-processing?
During Build docs for format conversion and extraction; during Ship security for PII redaction; during Launch distribution when turning HTML or DOCX into release PDFs.
Is nutrient-document-processing safe to install?
It sends files to a third-party API; review Nutrient terms, minimize sensitive uploads, and check the Security Audits panel on this Prism page before production use.
SKILL.md
# Nutrient Document Processing

> **Note:** This skill integrates with the Nutrient commercial API. Review their terms before use.

Process documents with the [Nutrient DWS Processor API](https://www.nutrient.io/api/). Convert formats, extract text and tables, OCR scanned documents, redact PII, add watermarks, digitally sign, and fill PDF forms.

## Setup

Get a free API key at **[nutrient.io](https://dashboard.nutrient.io/sign_up/?product=processor)**

```bash
export NUTRIENT_API_KEY="pdf_live_..."
```

All requests go to `https://api.nutrient.io/build` as multipart POST with an `instructions` JSON field.

## Operations

### Convert Documents

```bash
# DOCX to PDF
curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer $NUTRIENT_API_KEY" \
  -F "document.docx=@document.docx" \
  -F 'instructions={"parts":[{"file":"document.docx"}]}' \
  -o output.pdf

# PDF to DOCX
curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer $NUTRIENT_API_KEY" \
  -F "document.pdf=@document.pdf" \
  -F 'instructions={"parts":[{"file":"document.pdf"}],"output":{"type":"docx"}}' \
  -o output.docx

# HTML to PDF
curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer $NUTRIENT_API_KEY" \
  -F "index.html=@index.html" \
  -F 'instructions={"parts":[{"html":"index.html"}]}' \
  -o output.pdf
```

Supported inputs: PDF, DOCX, XLSX, PPTX, DOC, XLS, PPT, PPS, PPSX, ODT, RTF, HTML, JPG, PNG, TIFF, HEIC, GIF, WebP, SVG, TGA, EPS.
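The file-based conversion calls above differ only in the uploaded file name and the optional `output.type`, so an agent may prefer to wrap them. The following is a minimal sketch, not part of the skill: `build_instructions` and `nutrient_convert` are hypothetical helper names, and the sketch assumes simple file names (no quotes, commas, or semicolons) because it does no JSON or curl escaping. Only the endpoint, header, and JSON shapes are taken from the examples above.

```bash
#!/usr/bin/env bash
# Hypothetical helpers (not part of the skill). The endpoint, the
# Authorization header, and the JSON shapes are copied from the examples above.

# Print the instructions JSON for a single file part.
# $1 = uploaded file name, $2 = optional output type (for example: docx).
build_instructions() {
  local name="$1" type="${2:-}"
  if [ -n "$type" ]; then
    printf '{"parts":[{"file":"%s"}],"output":{"type":"%s"}}' "$name" "$type"
  else
    printf '{"parts":[{"file":"%s"}]}' "$name"
  fi
}

# Convert one local file.
# $1 = input path, $2 = output path, $3 = optional output type.
nutrient_convert() {
  local input="$1" output="$2" type="${3:-}"
  if [ -z "${NUTRIENT_API_KEY:-}" ]; then
    echo "NUTRIENT_API_KEY is not set" >&2
    return 1
  fi
  local name
  name="$(basename "$input")"
  # --fail makes curl exit non-zero on HTTP errors instead of saving
  # the error body as if it were the converted document.
  curl --fail -sS -X POST https://api.nutrient.io/build \
    -H "Authorization: Bearer $NUTRIENT_API_KEY" \
    -F "$name=@$input" \
    -F "instructions=$(build_instructions "$name" "$type")" \
    -o "$output"
}
```

With the functions sourced, `nutrient_convert document.docx output.pdf` mirrors the DOCX to PDF call and `nutrient_convert document.pdf output.docx docx` mirrors the PDF to DOCX call. The HTML example uses an `html` part rather than a `file` part, so it is not covered by this sketch.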

### Extract Text and Data

```bash
# Extract plain text
curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer $NUTRIENT_API_KEY" \
  -F "document.pdf=@document.pdf" \
  -F 'instructions={"parts":[{"file":"document.pdf"}],"output":{"type":"text"}}' \
  -o output.txt

# Extract tables as Excel
curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer $NUTRIENT_API_KEY" \
  -F "document.pdf=@document.pdf" \
  -F 'instructions={"parts":[{"file":"document.pdf"}],"output":{"type":"xlsx"}}' \
  -o tables.xlsx
```

### OCR Scanned Documents

```bash
# OCR to searchable PDF (supports 100+ languages)
curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer $NUTRIENT_API_KEY" \
  -F "scanned.pdf=@scanned.pdf" \
  -F 'instructions={"parts":[{"file":"scanned.pdf"}],"actions":[{"type":"ocr","language":"english"}]}' \
  -o searchable.pdf
```

Languages: Supports 100+ languages via ISO 639-2 codes (e.g., `eng`, `deu`, `fra`, `spa`, `jpn`, `kor`, `chi_sim`, `chi_tra`, `ara`, `hin`, `rus`). Full language names like `english` or `german` also work. See the [complete OCR language table](https://www.nutrient.io/guides/document-engine/ocr/language-support/) for all supported codes.
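Since the OCR language is the only part of the request that changes between documents, it can be passed as a parameter. This is a rough sketch under the same assumptions as the OCR example above: `build_ocr_instructions` and `nutrient_ocr` are hypothetical names, the default language `english` is taken from that example, and file names are assumed to need no JSON or curl escaping.

```bash
#!/usr/bin/env bash
# Hypothetical helpers (not part of the skill). The endpoint, header, and
# JSON shape are copied from the OCR example above.

# Print OCR instructions JSON.
# $1 = uploaded file name, $2 = language code or name (default: english).
build_ocr_instructions() {
  local name="$1" lang="${2:-english}"
  printf '{"parts":[{"file":"%s"}],"actions":[{"type":"ocr","language":"%s"}]}' "$name" "$lang"
}

# OCR one scanned file into a searchable PDF.
# $1 = input path, $2 = output path, $3 = optional language.
nutrient_ocr() {
  local input="$1" output="$2" lang="${3:-english}"
  if [ -z "${NUTRIENT_API_KEY:-}" ]; then
    echo "NUTRIENT_API_KEY is not set" >&2
    return 1
  fi
  local name
  name="$(basename "$input")"
  # --fail makes curl exit non-zero on HTTP errors.
  curl --fail -sS -X POST https://api.nutrient.io/build \
    -H "Authorization: Bearer $NUTRIENT_API_KEY" \
    -F "$name=@$input" \
    -F "instructions=$(build_ocr_instructions "$name" "$lang")" \
    -o "$output"
}
```

For example, `nutrient_ocr scanned.pdf searchable.pdf deu` would request German OCR; check the linked language table before relying on a particular code.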

### Redact Sensitive Information

```bash
# Pattern-based (SSN, email)
curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer $NUTRIENT_API_KEY" \
  -F "document.pdf=@document.pdf" \
  -F 'instructions={"parts":[{"file":"document.pdf"}],"actions":[{"type":"redaction","strategy":"preset","strategyOptions":{"preset":"social-security-number"}},{"type":"redaction","strategy":"preset","strategyOptions":{"preset":"email-address"}}]}' \
  -o redacted.pdf

# Regex-based
curl -X POST https://api.nutrient.io/build \
  -H "Authorization: Bearer $NUTRIENT_API_KEY" \
  -F "document.pdf=@document.pdf" \
  -F 'instructions={"parts":[{"file":"document.pdf"}],"actions":[{"type":"redaction","strategy":"regex","strategyOptions":{"regex":"\\b[A-Z]{2}\\d{6}\\b"}}]}' \
  -o redacted.pdf
```

Presets: `social-security-number`, `email-address`, `credit-card-number`, `international-phone-number`, `north-american-phone-number`, `date`, `time`, `url`, `ipv4`, `ipv6`, `mac-address`, `us-zip-code`, `vin`.

### Add Watermarks

```bash
curl -X POST https://api.nutrient.io/bu