
VLM Run Skills

vlm-run-skills is a Claude Code plugin for the Build phase that connects your agent to the VLM Run CLI for document, image, and video understanding and extraction.

by vlm-run · github.com/vlm-run/skills

Wire Claude Code to the VLM Run CLI so agents can extract, summarize, and reason over images, documents, and videos in one command flow.

7 GitHub stars · 0 installs · 0 community votes
Install

Add it to Claude Code

Install the plugin in Claude Code. One command, paste-ready.

Install the plugin
/plugin install vlm-run-skills@vlm-run/skills
Agent API

Built to be called by your agent

Skillselion is itself an MCP server. Your agent can pull this entry and a paste-ready install config straight from the API - no copy-paste.

Retrieve this entry with skillselion.get_details("plugin:vlm-run/skills") and the paste-ready config with skillselion.get_install_config("plugin:vlm-run/skills").
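
As a rough illustration, the two calls above could be issued as MCP tool calls. MCP is built on JSON-RPC 2.0, and tools are invoked through the `tools/call` method. The sketch below only builds the request payloads and sends nothing. The argument key `id` is hypothetical, and the tool names are assumed to be registered on the server exactly as written in the listing, which shows only call-style notation.

```python
import json
from itertools import count

# Entry identifier taken verbatim from the listing text.
ENTRY_ID = "plugin:vlm-run/skills"

# JSON-RPC requests need a unique id per request; a simple counter is enough here.
_request_ids = count(1)


def build_tool_call(tool_name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request body for the MCP `tools/call` method."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# ASSUMPTION: the argument key "id" is hypothetical. The listing does not
# document the parameter name; check the server's tool schema before use.
details_request = build_tool_call("skillselion.get_details", {"id": ENTRY_ID})
config_request = build_tool_call("skillselion.get_install_config", {"id": ENTRY_ID})

if __name__ == "__main__":
    print(json.dumps(details_request, indent=2))
    print(json.dumps(config_request, indent=2))
```

In practice an MCP client library would handle the transport and session handshake; the payload shape is what matters for understanding what the agent sends.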

About

What it does

vlm-run-skills is a Claude Code plugin that teaches your agent how to drive the VLM Run CLI for vision-language workloads. Solo builders who ship agents or internal tools install it when they need structured extraction, summarization, or understanding across images, PDFs, and videos without hand-rolling API glue each time. The skill sits in the Build phase as an integration layer: you authenticate and run the CLI, then let Claude orchestrate prompts and outputs in your repo workflow.

It fits indie teams automating document intake, media QA, or research pipelines where a general-purpose LLM is not enough and a visual model is required. Expect intermediate setup (API or CLI credentials and clearly scoped tasks) rather than a one-click no-code app. It is a task integration for multimodal agents, not a full observability or deployment stack.

Highlights

  • Claude skill aligned with the VLM Run CLI for visual and document workflows
  • Supports understanding, summarization, and object-detection-style tasks across images, documents, and video
  • Natural-language driven processing for extraction and generation over visual media
  • Orion/VLM Run-oriented keywords for document and multimodal agent automation
  • Single-plugin bundle focused on agent + CLI multimodal execution

Why builders use it

Agents handle text well but struggle to run repeatable visual and document pipelines without a dedicated VLM CLI skill.

After install, Claude can invoke VLM Run for multimodal extraction and summarization from your terminal-driven agent workflow.

At a glance

  • Type - Plugin in LLM Integration.
  • Adoption - 0 installs, 7 stars, 0 votes.

FAQ

Who is vlm-run-skills for?

Solo and small-team developers using Claude Code who want vision-language document and media tasks via the VLM Run CLI.

When should I use vlm-run-skills?

Use it during Build when you are integrating multimodal extraction, summarization, or detection into an agent or CLI workflow.

How do I add vlm-run-skills to my agent?

Register the vlm-run/skills plugin in Claude Code, configure VLM Run CLI access, then invoke the bundled skill in sessions that need visual or document processing.
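
Before invoking the skill, it can help to confirm that the "configure VLM Run CLI access" step is actually done. The following is a minimal preflight sketch, not part of the plugin. The executable name `vlmrun` and the environment variable `VLMRUN_API_KEY` are assumptions that this listing does not confirm; adjust both to match your installation.

```python
import os
import shutil
from typing import Callable, List, Mapping, Optional

# ASSUMPTIONS: neither name is confirmed by this listing.
CLI_NAME = "vlmrun"             # hypothetical executable name for the VLM Run CLI
API_KEY_VAR = "VLMRUN_API_KEY"  # hypothetical environment variable for the API key


def preflight(
    cli_name: str = CLI_NAME,
    api_key_var: str = API_KEY_VAR,
    env: Optional[Mapping[str, str]] = None,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return a list of setup problems; an empty list means the checks passed."""
    env = os.environ if env is None else env
    problems = []
    # shutil.which returns None when the executable is not on PATH.
    if which(cli_name) is None:
        problems.append(f"'{cli_name}' was not found on PATH")
    # Treat a missing or empty variable as "not set".
    if not env.get(api_key_var):
        problems.append(f"environment variable {api_key_var} is not set")
    return problems


if __name__ == "__main__":
    issues = preflight()
    if issues:
        for issue in issues:
            print("setup issue:", issue)
    else:
        print("CLI and credentials look ready")
```

The `env` and `which` parameters exist only so the check can be exercised without touching the real environment.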

