Vlm Run Skills

Name: Vlm Run Skills
Author: vlm-run

vlm-run/skills

15 repo stars
Updated April 16, 2026

vlm-run-skills is a Claude Code plugin that connects your agent to the VLM Run CLI for document, image, and video understanding and extraction.

About

vlm-run-skills is a Claude Code plugin that teaches your agent how to drive the VLM Run CLI for vision-language workloads. Developers who ship agents or internal tools install it when they need structured extraction, summarization, or understanding across images, PDFs, and videos without hand-rolling API glue each time. The skill sits in the build phase as an integration layer: you authenticate and run the CLI, then let Claude orchestrate prompts and outputs in your repo workflow. It fits teams automating document intake, media QA, or research pipelines where a general-purpose LLM is not enough and a visual model is required. Expect intermediate setup (API or CLI credentials and clear task boundaries) rather than a one-click no-code app. It is a task integration for multimodal agents, not a full observability or deployment stack.

Claude skill aligned with the VLM Run CLI for visual and document workflows
Supports image, document, and video understanding, summarization, and object-style detection tasks
Natural-language driven processing for extraction and generation over visual media
Orion/VLM Run-oriented keywords for document and multimodal agent automation
Single-plugin bundle focused on agent + CLI multimodal execution

Vlm Run Skills by the numbers

Data as of Jul 7, 2026 (Skillselion catalog sync)

/plugin install vlm-run-skills@vlm-run/skills


Repo stars: 15
Last updated: April 16, 2026
Repository: vlm-run/skills

What it does

Wire Claude Code to the VLM Run CLI so agents can extract, summarize, and reason over images, documents, and videos in one command flow.

Who is it for?

Best when you're building document or media automation with Claude Code and already use or plan to use VLM Run.

Skip if: you only need plain text chat with no images, PDFs, or video processing.

What you get

After install, Claude can invoke VLM Run for multimodal extraction and summarization from your terminal-driven agent workflow.

Agent-invokable VLM Run CLI workflows for multimodal tasks
Summaries, extractions, or structured outputs from visual and document inputs
Repeatable terminal-driven VLM steps inside your coding session
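
The idea of a repeatable, terminal-driven step that yields structured output can be sketched in Python as a small wrapper that runs a command and parses its standard output as JSON. This is a minimal sketch under stated assumptions: the helper name run_json_step is hypothetical, the listing does not say that the VLM Run CLI prints JSON to stdout, and the command used here is a stand-in Python one-liner rather than a real VLM Run invocation.

```python
import json
import subprocess
import sys
from typing import Any, List


def run_json_step(argv: List[str], timeout: float = 60.0) -> Any:
    """Run a command and parse its stdout as JSON.

    Raises subprocess.CalledProcessError on a non-zero exit code and
    json.JSONDecodeError if stdout is not valid JSON.
    """
    completed = subprocess.run(
        argv, capture_output=True, text=True, timeout=timeout, check=True
    )
    return json.loads(completed.stdout)


# Stand-in command: a Python one-liner that prints a JSON object.
# It is NOT the VLM Run CLI; substitute the real invocation from the
# VLM Run documentation (assumed here to emit JSON on stdout).
STAND_IN = [
    sys.executable,
    "-c",
    'import json; print(json.dumps({"pages": 2, "title": "invoice"}))',
]

if __name__ == "__main__":
    result = run_json_step(STAND_IN)
    print(result["title"], result["pages"])
```

Passing the command as an argument list rather than a shell string avoids quoting problems with file paths, and check=True turns a failed step into an exception the calling workflow can handle.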


How it compares

It is an agent skill for the VLM Run CLI, not a hosted no-code document SaaS or a generic MCP-only file browser.

FAQ

Who is Vlm Run Skills for?

Developers, including small teams, who use Claude Code and want to run vision-language document and media tasks via the VLM Run CLI.

When should I use Vlm Run Skills?

Use it during the build phase, when you are integrating multimodal extraction, summarization, or detection into an agent or CLI workflow.

How do I add Vlm Run Skills to my agent?

Register the vlm-run/skills plugin in Claude Code, configure VLM Run CLI access, then invoke the bundled skill in sessions that need visual or document processing.
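
Before invoking the skill, it can help to confirm that the "configure VLM Run CLI access" step is actually done. The following is a minimal preflight sketch, not part of the plugin: the function name is hypothetical, and both the executable name vlmrun and the environment variable VLMRUN_API_KEY are assumptions that should be checked against the VLM Run documentation.

```python
import os
import shutil
from typing import Callable, List, Mapping, Optional

# Both names below are assumptions for illustration; check the VLM Run
# documentation for the actual executable name and credential variable.
ASSUMED_CLI_NAME = "vlmrun"
ASSUMED_KEY_VAR = "VLMRUN_API_KEY"


def missing_prerequisites(
    cli_name: str = ASSUMED_CLI_NAME,
    key_var: str = ASSUMED_KEY_VAR,
    environ: Optional[Mapping[str, str]] = None,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return descriptions of setup steps that still look undone."""
    env = os.environ if environ is None else environ
    problems: List[str] = []
    # The CLI must be discoverable on PATH for the agent to call it.
    if which(cli_name) is None:
        problems.append(f"executable '{cli_name}' not found on PATH")
    # An unset or empty credential variable counts as missing.
    if not env.get(key_var):
        problems.append(f"environment variable {key_var} is not set")
    return problems


if __name__ == "__main__":
    issues = missing_prerequisites()
    if issues:
        for issue in issues:
            print("missing:", issue)
    else:
        print("CLI and credentials look ready")
```

The environ and which parameters exist only so the check can be exercised without touching the real environment; in normal use the defaults are enough.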

Tags: LLM Integration, automation, llm