
Doc Importer
Turn PDF, Office, or HTML files into editable markdown for your repo docs or agent rewriting workflows.
Install
npx skills add https://github.com/athola/claude-night-market --skill doc-importer
What is this skill?
- Imports DOCX, PPTX, XLSX, PDF, and HTML into markdown via leyline:document-conversion
- Workflow starts by identifying the source, then converts it, preferring the markitdown MCP tool and falling back to native tools
- Chains scribe:slop-detector and scribe:doc-generator for downstream cleanup and authoring
- Explicit routing: not for academic papers (tome:papers) or web knowledge intake (memory-palace)
Adoption & trust: 1 install on skills.sh; 304 GitHub stars; 2/3 security scanners passed (skills.sh audits); trending (+100% hot-view momentum).
Common Questions / FAQ
Is Doc Importer safe to install?
skills.sh reports 2 of 3 security scanners passed. Review the Security Audits panel on this page before installing in production.
SKILL.md - Doc Importer
# Document Importer

Import external documents into editable markdown.

## When To Use

- User provides a DOCX, PPTX, XLSX, PDF, or HTML file to convert into project documentation
- User wants to extract content from a document for rewriting or remediation
- User has a slide deck or spreadsheet to turn into markdown documentation

## When NOT To Use

- Academic paper analysis: use `tome:papers`
- Web article knowledge intake: use `memory-palace:knowledge-intake`
- Content already in markdown: use `scribe:doc-generator` remediation mode directly

## Import Workflow

### Step 1: Identify Source

Determine the source document:

- **Local file path**: verify it exists with Read tool
- **URL**: verify accessibility
- **User description**: confirm format and location

### Step 2: Convert to Markdown

Apply the `leyline:document-conversion` protocol:

1. Construct URI from source (file path or URL)
2. Try the markitdown MCP tool for best quality
3. If unavailable, use native tool fallbacks
4. If format unsupported, inform user

### Step 3: Structural Cleanup

After conversion, normalize the markdown:

- Ensure ATX headings (`# style`, not setext underlines)
- Wrap prose lines at 80 characters per `leyline:markdown-formatting`
- Fix broken tables (align columns, add headers)
- Remove conversion artifacts (page numbers, headers/footers, watermarks, repeated logos)
- Preserve all substantive content

### Step 4: Sanitize External Content

Apply the `leyline:content-sanitization` checklist:

- Size check (truncate sections over 2000 words)
- Strip system/instruction tags
- Wrap in external content boundary markers

### Step 5: Write Draft

Write the converted markdown to the target location. Default: same directory as source, with `.md` extension. Ask the user for target path if ambiguous.
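
Steps 3 through 5 can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the skill's implementation: the `leyline` protocols are not reproduced here, so the instruction-tag names, the page-footer pattern, the boundary-marker text, the truncation note, and every function name are hypothetical. Line wrapping and table repair are left out.

```python
import re
from pathlib import Path

# Assumed patterns and markers; the real checklist lives in
# leyline:content-sanitization and may differ.
INSTRUCTION_TAG = re.compile(r"</?(?:system|instructions?)\b[^>]*>", re.IGNORECASE)
PAGE_FOOTER = re.compile(r"^\s*page\s+\d+(?:\s+of\s+\d+)?\s*$", re.IGNORECASE)
HEADING = re.compile(r"#{1,6}\s")
MAX_SECTION_WORDS = 2000  # limit stated in Step 4


def setext_to_atx(lines):
    """Rewrite setext headings (text underlined with === or ---) as ATX."""
    out = []
    for line in lines:
        underline = line.strip()
        prev = out[-1] if out else ""
        # Require 3+ characters to avoid false positives (a simplification).
        is_underline = re.fullmatch(r"={3,}|-{3,}", underline) is not None
        plain_prev = prev.strip() and not prev.lstrip().startswith(("#", "|", "-", "*", ">"))
        if is_underline and plain_prev:
            level = "#" if underline[0] == "=" else "##"
            out[-1] = f"{level} {prev.strip()}"
        else:
            out.append(line)
    return out


def truncate_sections(lines, limit=MAX_SECTION_WORDS):
    """Drop the remainder of any section whose body exceeds `limit` words."""
    out, count, skipping = [], 0, False
    for line in lines:
        if HEADING.match(line):
            count, skipping = 0, False  # a new section resets the budget
            out.append(line)
            continue
        if skipping:
            continue
        count += len(line.split())
        if count > limit:
            # Hypothetical note so the truncation is visible to reviewers.
            out.append(f"<!-- REVIEW: section truncated at {limit} words -->")
            skipping = True
        else:
            out.append(line)
    return out


def prepare_draft(markdown, source):
    """Return (default target path, cleaned and wrapped markdown)."""
    lines = setext_to_atx(markdown.splitlines())
    lines = [l for l in lines if not PAGE_FOOTER.match(l)]  # Step 3 artifacts
    lines = [INSTRUCTION_TAG.sub("", l) for l in lines]     # Step 4 tag strip
    lines = truncate_sections(lines)                        # Step 4 size check
    body = "\n".join(lines)
    wrapped = (
        f"<!-- BEGIN EXTERNAL CONTENT: {source} -->\n"
        f"{body}\n"
        "<!-- END EXTERNAL CONTENT -->\n"
    )
    # Step 5 default: same directory as the source, with a .md extension.
    return Path(source).with_suffix(".md"), wrapped
```

Calling `prepare_draft(text, "docs/report.docx")` would return `docs/report.md` as the target path. The caller still decides whether to write there or ask the user for a different location.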
### Step 6: Hand Off to Doc-Generator (Optional)

If the user wants polishing or rewriting:

- Invoke `Skill(scribe:doc-generator)` in Remediation mode on the imported file
- The doc-generator handles slop detection, style application, and quality gates

Offer this step; do not assume the user wants remediation.

## Output Quality

The imported markdown should:

- Have a top-level `# Title` from the document title
- Preserve the original heading hierarchy
- Convert tables to markdown tables
- Convert images to `![alt](path)` references (note: image files may need separate handling)
- Convert lists faithfully
- Mark unclear or garbled sections with `<!-- REVIEW: conversion artifact -->`

## Exit Criteria

- Source document identified and accessible
- Conversion attempted via document-conversion protocol
- Structural cleanup applied
- Sanitization checklist passed
- Draft written to target path
- User informed of any conversion limitations
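
Some of the Output Quality expectations can be spot-checked mechanically before reporting back to the user. The sketch below is one possible checker, not part of the skill: the function name `check_import_quality` and the wording of its findings are hypothetical, and it covers only three points (a top-level title, leftover setext headings, and review markers).

```python
import re

REVIEW_MARKER = "<!-- REVIEW: conversion artifact -->"  # marker text from the skill
FENCE = "`" * 3  # code-fence delimiter, built this way to keep the example readable


def check_import_quality(markdown):
    """Return human-readable findings for an imported draft (empty list = clean)."""
    findings = []
    in_fence = False
    first_heading = None
    prev = ""
    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence  # ignore headings inside code blocks
            prev = ""
            continue
        if in_fence:
            continue
        if first_heading is None and re.match(r"#{1,6}\s", line):
            first_heading = line
        # Only === underlines are checked; --- is ambiguous with rules and tables.
        if prev.strip() and re.fullmatch(r"={3,}", line.strip()):
            findings.append(f"setext heading left unconverted: {prev.strip()!r}")
        prev = line
    if first_heading is None or not first_heading.startswith("# "):
        findings.append("no top-level '# Title' before other headings")
    flagged = markdown.count(REVIEW_MARKER)
    if flagged:
        findings.append(f"{flagged} section(s) marked for review")
    return findings
```

An empty result does not prove the import is faithful; it only means these three checks found nothing, so the findings are best treated as input to the "user informed of any conversion limitations" criterion.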