
Baoyu Url To Markdown
Fetch any public URL through a headless Chrome CDP session and save clean markdown for research, competitor briefs, or content pipelines.
Overview
baoyu-url-to-markdown is an agent skill most often used in Idea (also Build docs, Grow content) that captures web pages as markdown via Chrome CDP for research and repurposing.
Install
npx skills add https://github.com/ideacco/baoyu-skills-openclaw --skill baoyu-url-to-markdownWhat is this skill?
- Chrome DevTools Protocol client with timed fetch and session-aware CDP commands
- Dedicated Chrome profile path resolution for repeatable crawls
- Network-idle and CDP connect timeouts tuned for slow SPAs
- Node child-process spawning for browser-driven extraction
- Output suited for downstream markdown-to-HTML or agent summarization
Adoption & trust: 1 installs on skills.sh; 12 GitHub stars; 2/3 security scanners passed (skills.sh audits); trending (+100% hot-view momentum).
What problem does it solve?
You need the full text of a modern web page in markdown but copy-paste and simple HTTP fetches miss client-rendered content.
Who is it for?
Indie builders archiving competitor pages, documentation, or long-form posts before analysis or content reuse.
Skip if: Sites that require authenticated sessions you have not configured in the Chrome profile, or workflows that only need API JSON without page rendering.
When should I use this skill?
You have one or more public URLs and need markdown files produced via Chrome CDP rather than manual copy or plain HTTP.
What do I get? / Deliverables
You get a saved markdown capture of the target URL that agents can summarize, diff, or pass into HTML or publishing workflows.
- Markdown file(s) representing fetched page content
- Optional local browser profile state for repeatable fetches
Recommended Skills
Journey fit
Spans multiple journey phases - primary shelf plus alternate fits below.
Capturing web sources is usually the first step before validating or writing—so the canonical shelf is Idea → research, even though output feeds docs and content later. Competitor pages, docs, and articles are gathered as markdown snapshots during discovery and research, not after the product is built.
Where it fits
Snapshot three competitor pricing pages to markdown before you scope your own tiers.
Capture feature comparison pages for a structured teardown doc.
Pull third-party API reference pages into markdown for an internal integration guide.
Archive a popular blog post as markdown before rewriting it for your newsletter.
How it compares
Browser-backed capture skill—not a static site generator or a generic curl scraper.
Common Questions / FAQ
Who is baoyu-url-to-markdown for?
Solo and indie builders who research on the web and want agent-ready markdown from real browser renders, especially for JS-heavy marketing or docs sites.
When should I use baoyu-url-to-markdown?
During Idea research to snapshot competitors, in Build when ingesting external docs, or in Grow when refreshing source articles—whenever you need durable markdown from a URL.
Is baoyu-url-to-markdown safe to install?
Review the Security Audits panel on this Prism page and treat browser automation as high trust: it can load arbitrary URLs and write files locally.
Workflow Chain
Then invoke: baoyu markdown to html
SKILL.md
READMESKILL.md - Baoyu Url To Markdown
import { spawn, type ChildProcess } from "node:child_process"; import fs from "node:fs"; import { mkdir } from "node:fs/promises"; import net from "node:net"; import process from "node:process"; import { resolveUrlToMarkdownChromeProfileDir } from "./paths.js"; import { CDP_CONNECT_TIMEOUT_MS, NETWORK_IDLE_TIMEOUT_MS } from "./constants.js"; type CdpSendOptions = { sessionId?: string; timeoutMs?: number }; function sleep(ms: number): Promise<void> { return new Promise((resolve) => setTimeout(resolve, ms)); } async function fetchWithTimeout(url: string, init: RequestInit & { timeoutMs?: number } = {}): Promise<Response> { const { timeoutMs, ...rest } = init; if (!timeoutMs || timeoutMs <= 0) return fetch(url, rest); const ctl = new AbortController(); const t = setTimeout(() => ctl.abort(), timeoutMs); try { return await fetch(url, { ...rest, signal: ctl.signal }); } finally { clearTimeout(t); } } export class CdpConnection { private ws: WebSocket; private nextId = 0; private pending = new Map<number, { resolve: (v: unknown) => void; reject: (e: Error) => void; timer: ReturnType<typeof setTimeout> | null }>(); private eventHandlers = new Map<string, Set<(params: unknown) => void>>(); private constructor(ws: WebSocket) { this.ws = ws; this.ws.addEventListener("message", (event) => { try { const data = typeof event.data === "string" ? event.data : new TextDecoder().decode(event.data as ArrayBuffer); const msg = JSON.parse(data) as { id?: number; method?: string; params?: unknown; result?: unknown; error?: { message?: string } }; if (msg.id) { const p = this.pending.get(msg.id); if (p) { this.pending.delete(msg.id); if (p.timer) clearTimeout(p.timer); if (msg.error?.message) p.reject(new Error(msg.error.message)); else p.resolve(msg.result); } } else if (msg.method) { const handlers = this.eventHandlers.get(msg.method); if (handlers) { for (const h of handlers) h(msg.params); } } } catch {} }); this.ws.addEventListener("close", () => { for (const [id, p] of this.pending.entries()) { this.pending.delete(id); if (p.timer) clearTimeout(p.timer); p.reject(new Error("CDP connection closed.")); } }); } static async connect(url: string, timeoutMs: number): Promise<CdpConnection> { const ws = new WebSocket(url); await new Promise<void>((resolve, reject) => { const t = setTimeout(() => reject(new Error("CDP connection timeout.")), timeoutMs); ws.addEventListener("open", () => { clearTimeout(t); resolve(); }); ws.addEventListener("error", () => { clearTimeout(t); reject(new Error("CDP connection failed.")); }); }); return new CdpConnection(ws); } on(event: string, handler: (params: unknown) => void): void { let handlers = this.eventHandlers.get(event); if (!handlers) { handlers = new Set(); this.eventHandlers.set(event, handlers); } handlers.add(handler); } off(event: string, handler: (params: unknown) => void): void { this.eventHandlers.get(event)?.delete(handler); } async send<T = unknown>(method: string, params?: Record<string, unknown>, opts?: CdpSendOptions): Promise<T> { const id = ++this.nextId; const msg: Record<string, unknown> = { id, method }; if (params) msg.params = params; if (opts?.sessionId) msg.sessionId = opts.sessionId; const timeoutMs = opts?.timeoutMs ?? 15_000; const out = await new Promise<unknown>((resolve, reject) => { const t = timeoutMs > 0 ? setTimeout(() => { this.pending.delete(id); reject(new Error(`CDP timeout: ${method}`)); }, timeoutMs) : null; this.pending.set(id, { resolve, reject, timer: t }); this.ws.send(JSON.stringify(msg)); }); return out as T; } close(): void { try { this.ws.close(); } catch {}