VLM Run Skills
vlm-run-skills is a Claude Code plugin for the Build phase that connects your agent to the VLM Run CLI for document, image, and video understanding and extraction.
Wire Claude Code to the VLM Run CLI so agents can extract, summarize, and reason over images, documents, and videos in one command flow.
Add it to Claude Code
Install the plugin in Claude Code. One command, paste-ready.
/plugin install vlm-run-skills@vlm-run/skills
Built to be called by your agent
Skillselion is itself an MCP server. Your agent can pull this entry and a paste-ready install config straight from the API - no copy-paste.
Retrieve this entry with skillselion.get_details("plugin:vlm-run/skills") and the paste-ready config with skillselion.get_install_config("plugin:vlm-run/skills").
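As a rough sketch of what those two lookups might look like on the wire, the snippet below builds MCP `tools/call` requests in JSON-RPC 2.0 form. The tool names mirror the calls quoted above, but the argument key `id` and the helper `build_tool_call` are assumptions made for illustration, not confirmed Skillselion API. The server's own tool listing is the authority on the real schema.

```python
import json
from itertools import count

_request_ids = count(1)  # each JSON-RPC request gets a unique id

ENTRY_ID = "plugin:vlm-run/skills"  # identifier quoted in this entry


def build_tool_call(tool_name: str, entry_id: str) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message.

    ASSUMPTION: the argument key "id" is hypothetical; the actual input
    schema is whatever the server reports for the tool.
    """
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": {"id": entry_id}},
    }


# Tool names assumed to match the calls shown in the text above.
details_request = build_tool_call("get_details", ENTRY_ID)
config_request = build_tool_call("get_install_config", ENTRY_ID)

if __name__ == "__main__":
    print(json.dumps(details_request, indent=2))
    print(json.dumps(config_request, indent=2))
```

In practice an MCP-aware agent issues these calls itself once the server is registered; the sketch only makes the request shape concrete.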
What it does
vlm-run-skills is a Claude Code plugin that teaches your agent how to drive the VLM Run CLI for vision-language workloads. Solo builders who ship agents or internal tools install it when they need structured extraction, summarization, or understanding across images, PDFs, and videos without hand-rolling API glue each time. The skill sits in the build phase as an integration layer: you authenticate and run the CLI, then let Claude orchestrate prompts and outputs in your repo workflow. It fits indie teams automating document intake, media QA, or research pipelines where a general LLM is not enough and a visual model is required. Expect intermediate setup (API or CLI credentials and clear task boundaries) rather than a one-click no-code app. It is a task integration for multimodal agents, not a full observability or deployment stack.
Highlights
- Claude skill aligned with the VLM Run CLI for visual and document workflows
- Supports image, document, and video understanding, summarization, and object detection tasks
- Natural-language driven processing for extraction and generation over visual media
- Orion/VLM Run-oriented keywords for document and multimodal agent automation
- Single-plugin bundle focused on agent + CLI multimodal execution
Why builders use it
Agents are strong on text but awkward at repeatable visual and document pipelines without a dedicated VLM CLI skill.
After install, Claude can invoke VLM Run for multimodal extraction and summarization from your terminal-driven agent workflow.
At a glance
- Type: Plugin in LLM Integration.
- Adoption: 0 installs, 7 stars, 0 votes.
FAQ
Who is vlm-run-skills for?
Solo and small-team developers using Claude Code who want vision-language document and media tasks via the VLM Run CLI.
When should I use vlm-run-skills?
Use it during Build when you are integrating multimodal extraction, summarization, or detection into an agent or CLI workflow.
How do I add vlm-run-skills to my agent?
Register the vlm-run/skills plugin in Claude Code, configure VLM Run CLI access, then invoke the bundled skill in sessions that need visual or document processing.
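Before invoking the skill, it can help to confirm that the CLI and credentials are actually in place. The following is a minimal preflight sketch, assuming the CLI binary is named `vlmrun` and the API key is read from a `VLMRUN_API_KEY` environment variable. Both names are assumptions here, as is the helper `preflight`, so adjust them to match your installation.

```python
import os
import shutil

# ASSUMPTIONS: binary name and environment variable are illustrative;
# check the VLM Run CLI documentation for the actual values.
CLI_BINARY = "vlmrun"
API_KEY_ENV = "VLMRUN_API_KEY"


def preflight(binary: str = CLI_BINARY, key_env: str = API_KEY_ENV) -> list[str]:
    """Return a list of setup problems; an empty list means ready."""
    problems = []
    # shutil.which returns None when the executable is not on PATH
    if shutil.which(binary) is None:
        problems.append(f"'{binary}' not found on PATH")
    # treat an empty value the same as an unset variable
    if not os.environ.get(key_env):
        problems.append(f"environment variable {key_env} is not set")
    return problems


if __name__ == "__main__":
    issues = preflight()
    print("ready" if not issues else "\n".join(issues))
```

Running this once per machine catches the two most common setup gaps before an agent session starts.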