
Ragmetric MCP
Measure whether your RAG pipeline actually retrieves the right chunks before you ship or iterate on embeddings and chunking.
Overview
Ragmetric MCP is an MCP server for the Ship phase that scores RAG retrieval with recall@k, hit@k, MRR, and NDCG@k, and supports batch evaluation across many queries.
What is this MCP server?
- Computes recall@k, hit@k, MRR, and NDCG@k for ranked retrieval lists
- Provides evaluate_batch for scoring many queries in one agent turn
- Ships as the npm package @mukundakatta/ragmetric-mcp, server version 0.1.1, over stdio transport
- Fits RAG eval loops in Claude Code without leaving the IDE
- Requires no external API key (none is declared in server.json); it runs as a local stdio MCP server
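To make the four metrics concrete, here is a minimal Python sketch of their textbook definitions with binary relevance. It is not the server's code: the function names, the binary-gain NDCG variant, and the sample chunk IDs are assumptions for illustration, and the server may handle edge cases such as empty relevant sets or duplicate IDs differently.

```python
import math

# Illustrative definitions only; assumes each ranked list has unique IDs.

def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant IDs that appear in the top-k results."""
    rel = set(relevant)
    if not rel:
        return 0.0
    return len(set(ranked[:k]) & rel) / len(rel)

def hit_at_k(ranked, relevant, k):
    """1.0 if at least one relevant ID appears in the top-k, else 0.0."""
    return 1.0 if set(ranked[:k]) & set(relevant) else 0.0

def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant result, or 0.0 if none is retrieved.
    MRR is the mean of this value over all queries."""
    rel = set(relevant)
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in rel:
            return 1.0 / rank
    return 0.0

def ndcg_at_k(ranked, relevant, k):
    """NDCG@k with binary gains: DCG of the ranking over DCG of an ideal one."""
    rel = set(relevant)
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc_id in enumerate(ranked[:k], start=1)
        if doc_id in rel
    )
    ideal_hits = min(len(rel), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg else 0.0

# Hypothetical query: four retrieved chunks, three labeled as relevant.
ranked = ["c7", "c2", "c9", "c4"]
relevant = ["c2", "c4", "c5"]

print(recall_at_k(ranked, relevant, 3))     # one of three relevant chunks in the top 3
print(hit_at_k(ranked, relevant, 3))        # 1.0
print(reciprocal_rank(ranked, relevant))    # 0.5, first relevant chunk is at rank 2
print(ndcg_at_k(ranked, relevant, 3))
```

Recall@k and hit@k ignore ordering inside the top k, while reciprocal rank and NDCG@k reward placing relevant chunks earlier, which is why it is useful to track them together.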
What problem does it solve?
You cannot tell if a RAG change helped because you only eyeball a few chat replies instead of measuring retrieval.
Who is it for?
Solo builders with labeled Q&A–document pairs who iterate on vector search before shipping an AI feature.
Skip if: your team needs full answer correctness, latency SLAs, or production observability and has no labeled retrieval set.
What do I get? / Deliverables
You get standard information-retrieval (IR) metrics and batch runs that you can compare across embedding or chunking experiments.
- Per-query and aggregate recall@k, hit@k, MRR, and NDCG@k
- Batch evaluation output for regression comparisons
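As a rough picture of how batch output supports regression comparisons, the sketch below scores two hypothetical runs of the same labeled queries and lists the queries whose recall dropped. The helper `score_batch`, the input shape, and the data are assumptions for illustration; they do not describe the actual evaluate_batch tool schema.

```python
def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant IDs that appear in the top-k results."""
    rel = set(relevant)
    return len(set(ranked[:k]) & rel) / len(rel) if rel else 0.0

def score_batch(queries, k):
    """Hypothetical helper: per-query recall@k plus the mean over all queries."""
    per_query = {q["id"]: recall_at_k(q["ranked"], q["relevant"], k) for q in queries}
    mean_recall = sum(per_query.values()) / len(per_query) if per_query else 0.0
    return {"per_query": per_query, "mean_recall": mean_recall}

# Same labeled queries, retrieved with two different (made-up) pipeline settings.
baseline = [
    {"id": "q1", "ranked": ["a", "b", "c"], "relevant": ["c", "d"]},
    {"id": "q2", "ranked": ["x", "y", "z"], "relevant": ["y"]},
]
candidate = [
    {"id": "q1", "ranked": ["c", "d", "a"], "relevant": ["c", "d"]},
    {"id": "q2", "ranked": ["x", "z", "w"], "relevant": ["y"]},
]

base = score_batch(baseline, k=2)
cand = score_batch(candidate, k=2)

# Queries whose recall@2 got worse in the candidate run.
regressed = sorted(
    qid for qid, score in base["per_query"].items()
    if cand["per_query"][qid] < score
)

print(base["mean_recall"], cand["mean_recall"])
print(regressed)
```

In this made-up data both runs have the same mean recall@2, yet one query regressed, which is why keeping per-query scores alongside the aggregate is worthwhile.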
How it compares
Ragmetric MCP is an MCP server for retrieval (IR) metrics, not an agent skill for writing prompts or building indexes.
Common Questions / FAQ
Who is ragmetric-mcp for?
Indie and solo developers building RAG who want recall@k, MRR, and NDCG on a test set inside their coding agent.
When should I use ragmetric-mcp?
When you change chunking, embeddings, or reranking and need comparable retrieval scores before shipping.
How do I add ragmetric-mcp to my agent?
Install @mukundakatta/ragmetric-mcp from npm, add a stdio MCP entry pointing at that package, and restart Claude Code or Cursor.
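For orientation, a stdio entry might look like the JSON printed by the short Python sketch below. The `mcpServers` key, the `ragmetric` label, and launching through `npx -y` follow the common convention for MCP client config files and are assumptions here; check the package README and your client's documentation for the exact file location and fields.

```python
import json

# Assumption: the client reads a JSON config containing an "mcpServers" map.
# "ragmetric" is a hypothetical label; only the package name comes from the listing.
entry = {
    "mcpServers": {
        "ragmetric": {
            "command": "npx",
            "args": ["-y", "@mukundakatta/ragmetric-mcp"],
        }
    }
}

# Print the fragment so it can be merged into the client's MCP config file.
print(json.dumps(entry, indent=2))
```

Because the server is local and uses stdio, the entry needs no URL and no API key environment variable.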