
Velocirag
Attach a low-latency local RAG MCP server so agents query your knowledge base with 4-layer fusion and ONNX Runtime instead of slow remote-only retrieval.
Overview
VelociRAG is a MCP server for the build phase that delivers agent-oriented RAG with 4-layer fusion and ONNX Runtime targeting sub-200ms search.
What is this MCP server?
- Lightning-fast RAG positioning with sub-200ms search in product description
- 4-layer fusion retrieval stack tuned for agent workflows
- ONNX Runtime execution for efficient local inference
- Stdio PyPI package velocirag (registry version 0.7.4)
- GitHub source at HaseebKhalid1507/VelociRAG for self-hosted agent memory
- Transport: stdio via PyPI registryType velocirag
- Product claims: 4-layer fusion, ONNX Runtime, sub-200ms search
Community signal: 10 GitHub stars.
What problem does it solve?
Agent features feel sluggish when every tool call waits on remote vector APIs or heavyweight embedding pipelines.
Who is it for?
Indie builders shipping agent copilots who want self-hosted, ONNX-backed RAG with MCP tool access on their dev machine or small server.
Skip if: Teams needing managed enterprise vector SaaS only, or workflows with no local Python runtime.
What do I get? / Deliverables
After installing velocirag from PyPI and registering stdio MCP, your agent gets fused local retrieval tuned for fast search turns.
- MCP tools for fused RAG queries over your indexed content
- ONNX-accelerated retrieval suitable for tight agent tool loops
- Local control plane for agent knowledge without mandatory cloud vector DB
Recommended MCP Servers
Journey fit
How it compares
Local fused RAG MCP over stdio, not a generic web-search browser MCP or a single prompt-only Claude skill.
Common Questions / FAQ
Who is VelociRAG for?
AI-coding agent builders who need fast, local RAG tool calls with ONNX Runtime and multi-layer fusion.
When should I use VelociRAG?
Use it while building agent integrations and grounding layers, and when ship-phase testing must measure retrieval latency under real MCP loads.
How do I add VelociRAG to my agent?
Install PyPI package velocirag 0.7.4, configure your MCP client stdio command to launch the server, index your corpus per VelociRAG docs, then restart the agent.