
Portable Text Conversion
Migrate legacy HTML from an old CMS into Sanity Portable Text blocks using htmlToBlocks and a compiled block schema.
Overview
portable-text-conversion is an agent skill for the Build phase that converts HTML into Sanity Portable Text blocks using @portabletext/block-tools and a compiled schema.
Install
npx skills add https://github.com/sanity-io/agent-toolkit --skill portable-text-conversion
What is this skill?
- htmlToBlocks from @portabletext/block-tools with JSDOM parseHtml for Node.js
- Schema.compile via @sanity/schema so only valid marks, styles, and custom block types are emitted
- Built-in handling patterns for Google Docs, Microsoft Word, and Notion HTML exports
- Explicit guidance to use @portabletext/markdown for Markdown sources instead of block-tools
- Note on the legacy @sanity/block-tools package: new projects should use @portabletext/block-tools, which has an identical API
Adoption & trust: 819 installs on skills.sh; 150 GitHub stars; 2/3 security scanners passed (skills.sh audits).
What problem does it solve?
You are importing HTML from a legacy CMS but Sanity needs Portable Text blocks that match your compiled block content schema.
Who is it for?
Solo builders on Sanity migrating HTML archives, Word/Docs/Notion exports, or one-off rich-text imports during a replatform.
Skip if: Greenfield Markdown-only content (use markdown-to-pt), non-Sanity CMS targets, or projects that will keep storing HTML without a PT model.
When should I use this skill?
When migrating HTML content from legacy CMSs into Sanity using @portabletext/block-tools and htmlToBlocks.
What do I get? / Deliverables
You get a repeatable Node pipeline that parses HTML with JSDOM and outputs valid Portable Text blocks ready for Sanity import or mutations.
- Import script using htmlToBlocks and Schema.compile
- Portable Text JSON aligned to your block types
- Migration notes for Markdown vs HTML source choice
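As a rough illustration of the Portable Text JSON deliverable, the sketch below shows approximately what the output looks like for the small HTML sample `<p>Hello <strong>world</strong></p><h2>Heading</h2>`. The type definitions are simplified, and the `_key` values and the `toPlainText` and `decoratorsUsed` helpers are hypothetical, written only for this example; they are not part of @portabletext/block-tools.

```typescript
// Simplified Portable Text types for this sketch (not the full specification).
interface Span {
  _type: 'span'
  _key: string
  text: string
  marks: string[] // decorator names, or keys that point into markDefs
}

interface MarkDef {
  _key: string
  _type: string
  [field: string]: unknown
}

interface Block {
  _type: 'block'
  _key: string
  style: string
  markDefs: MarkDef[]
  children: Span[]
}

// Approximate blocks for '<p>Hello <strong>world</strong></p><h2>Heading</h2>'.
// The _key values are placeholders; real keys are generated during conversion.
const sampleBlocks: Block[] = [
  {
    _type: 'block',
    _key: 'b1',
    style: 'normal',
    markDefs: [],
    children: [
      {_type: 'span', _key: 's1', text: 'Hello ', marks: []},
      {_type: 'span', _key: 's2', text: 'world', marks: ['strong']},
    ],
  },
  {
    _type: 'block',
    _key: 'b2',
    style: 'h2',
    markDefs: [],
    children: [{_type: 'span', _key: 's3', text: 'Heading', marks: []}],
  },
]

// Hypothetical helper: join span text per block, one block per line.
function toPlainText(blocks: Block[]): string {
  return blocks
    .map((block) => block.children.map((span) => span.text).join(''))
    .join('\n')
}

// Hypothetical helper: list decorator marks in use. Marks whose value matches
// a markDefs key are annotations (such as links) and are skipped.
function decoratorsUsed(blocks: Block[]): string[] {
  const found: string[] = []
  blocks.forEach((block) => {
    const annotationKeys = block.markDefs.map((def) => def._key)
    block.children.forEach((span) => {
      span.marks.forEach((mark) => {
        if (annotationKeys.indexOf(mark) === -1 && found.indexOf(mark) === -1) {
          found.push(mark)
        }
      })
    })
  })
  return found.sort()
}

console.log(toPlainText(sampleBlocks))
console.log(decoratorsUsed(sampleBlocks))
```

A check like `decoratorsUsed` can be a quick way to confirm that converted content only uses decorators your block schema declares before you import it.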
Journey fit
Content-model migration and import pipelines happen while building the Sanity-backed product, before ship-ready editorial flows. Backend fits schema compilation, Node import scripts, and block-tools conversion rather than pixel-level UI work.
How it compares
Use this skill for legacy HTML migration, and use @portabletext/markdown when the source is Markdown. It is not a generic HTML sanitizer skill.
Common Questions / FAQ
Who is portable-text-conversion for?
Developers and solo founders using Sanity who need agent-guided HTML-to-Portable-Text migration with schema-aware block-tools, typically during a CMS cutover.
When should I use portable-text-conversion?
Use it in Build when importing HTML from legacy systems, cleaning pasted rich text, or scripting bulk document creation before editors work in Studio.
Is portable-text-conversion safe to install?
The skill consists of documentation and npm dependencies for local conversion scripts. Review the Security Audits panel on this Prism page, and run imports against copies of production data first.
SKILL.md - Portable Text Conversion
# Convert HTML to Portable Text

Use `@portabletext/block-tools` to parse HTML into Portable Text blocks. This is the primary tool for migrating HTML content from legacy CMSs. It has built-in support for content from Google Docs, Microsoft Word, and Notion.

> **Note:** For Markdown sources, use `@portabletext/markdown` instead; it's simpler and more direct. See `rules/markdown-to-pt.md`.

> **Note:** `@sanity/block-tools` is the legacy package name. Use `@portabletext/block-tools` for new projects. The API is identical.

## Setup

```bash
npm install @portabletext/block-tools jsdom @sanity/schema
```

In Node.js, you must provide a `parseHtml` function that returns a DOM `Document`. Use JSDOM for this:

```ts
import {htmlToBlocks} from '@portabletext/block-tools'
import {JSDOM} from 'jsdom'
import Schema from '@sanity/schema'

// JSDOM is passed to htmlToBlocks via the parseHtml option:
// htmlToBlocks(html, blockContentType, {
//   parseHtml: (html) => new JSDOM(html).window.document,
// })
```

## Define Your Schema

`htmlToBlocks` needs a compiled Sanity block content type to know which marks, styles, and custom types are valid.
Use `@sanity/schema` to compile it:

```ts
const defaultSchema = Schema.compile({
  name: 'mySchema',
  types: [
    {
      name: 'post',
      type: 'document',
      fields: [
        {
          name: 'body',
          type: 'array',
          of: [
            {
              type: 'block',
              marks: {
                decorators: [
                  {title: 'Strong', value: 'strong'},
                  {title: 'Emphasis', value: 'em'},
                  {title: 'Code', value: 'code'},
                ],
                annotations: [
                  {
                    name: 'link',
                    type: 'object',
                    fields: [{name: 'href', type: 'url'}],
                  },
                ],
              },
              styles: [
                {title: 'Normal', value: 'normal'},
                {title: 'H2', value: 'h2'},
                {title: 'H3', value: 'h3'},
                {title: 'Quote', value: 'blockquote'},
              ],
              lists: [
                {title: 'Bullet', value: 'bullet'},
                {title: 'Number', value: 'number'},
              ],
            },
            {
              name: 'image',
              type: 'image',
              fields: [{name: 'alt', type: 'string'}],
            },
          ],
        },
      ],
    },
  ],
})

const blockContentType = defaultSchema
  .get('post')
  .fields.find((f) => f.name === 'body').type
```

## Basic Conversion

```ts
const html = '<p>Hello <strong>world</strong></p><h2>Heading</h2>'

const blocks = htmlToBlocks(html, blockContentType, {
  parseHtml: (html) => new JSDOM(html).window.document,
})
```

## Custom Deserializers

Handle HTML elements that don't map directly to standard PT:

```ts
const blocks = htmlToBlocks(html, blockContentType, {
  parseHtml: (html) => new JSDOM(html).window.document,
  rules: [
    // Convert <img> to image blocks
    {
      deserialize(el, next, block) {
        if (el.tagName?.toLowerCase() !== 'img') return undefined
        return block({
          _type: 'image',
          asset: {
            _type: 'reference',
            _ref: '', // Upload image separately, set ref after
          },
          alt: el.getAttribute('alt') || '',
          _sanityAsset: `image@${el.getAttribute('src')}`, // for migration tooling
        })
      },
    },
    // Convert <a> with custom attributes
    {
      deserialize(el, next, block) {
        if (el.tagName?.toLowerCase() !== 'a') return undefined
        const href = el.getAttribute('href') || ''
        const target = el.getAttribute('target') || ''
        return