
Codabench MCP
Let your coding agent drive Codabench competitions: browse benchmarks, submit runs, and track participant workflows without hand-calling the REST API.
Overview
Codabench MCP is a Build-phase MCP server that exposes the Codabench REST API so agents can run a full ML-benchmark participant workflow.
What is this MCP server?
- Stdio MCP server (PyPI package codabench-mcp, v0.1.1) built on the Codabench REST API
- End-to-end ML-benchmark participant workflow driven from the agent, not one-off HTTP snippets
- Authenticated calls via the required CODABENCH_API_TOKEN (a DRF token, obtained as described in the Codabench API docs)
- Pairs with Claude Code, Cursor, and other MCP-capable agents for repeatable benchmark operations
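As a rough illustration of the token requirement listed above, the sketch below builds the request header that Django REST Framework's default TokenAuthentication scheme expects. It assumes the token is supplied through an environment variable named CODABENCH_API_TOKEN; the helper name build_auth_headers is hypothetical and is not part of codabench-mcp.

```python
import os


def build_auth_headers(env=None):
    """Build HTTP headers for a DRF token-authenticated request.

    Assumption: the token is read from the CODABENCH_API_TOKEN
    environment variable. This helper is illustrative only.
    """
    env = os.environ if env is None else env
    token = env.get("CODABENCH_API_TOKEN", "").strip()
    if not token:
        raise RuntimeError("CODABENCH_API_TOKEN is not set")
    # DRF's TokenAuthentication uses the "Token" keyword by default.
    return {"Authorization": f"Token {token}"}
```

Failing fast on a missing token keeps a misconfigured agent session from sending unauthenticated requests.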
What problem does it solve?
Without an MCP server, driving Codabench submissions and status checks from an agent means copying REST endpoints and tokens into ad-hoc scripts that break whenever the competition flow changes.
Who is it for?
Solo ML builders and indie teams who compete on Codabench and want Claude Code or Cursor to manage benchmark participation alongside their repo.
Skip if: your team does not use Codabench, or you only need static dataset downloads without live competition APIs.
What do I get? / Deliverables
After you register the server with CODABENCH_API_TOKEN, your agent can call Codabench through MCP tools for a consistent participant workflow instead of one-off HTTP calls.
- MCP tools backed by Codabench REST for participant-oriented actions
- Repeatable agent-driven benchmark workflow without custom HTTP scripts
- Authenticated stdio MCP session using your Codabench token
Journey fit
ML benchmark participation is part of building and validating model pipelines, so the canonical shelf is Build, where external APIs are wired into the product or eval harness. Codabench is a third-party REST integration, and Integrations is the right subphase for MCP tools that proxy a specific platform API.
How it compares
Codabench MCP is a Codabench REST integration exposed through MCP, not a local training or hyperparameter-tuning skill.
Common Questions / FAQ
Who is Codabench MCP for?
It is for developers and ML hobbyists who use Codabench for benchmarks and want their AI coding agent to handle participant API workflows through MCP.
When should I use Codabench MCP?
Use it when you are building or operating an ML eval loop on Codabench and need submit, poll, and workflow actions from Claude Code, Cursor, or another MCP client.
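The submit-then-poll pattern mentioned above can be sketched as a small loop. This is a generic illustration, not the server's implementation: fetch_status is a hypothetical callable standing in for whatever tool or request returns a submission's status, and the terminal status names are assumptions that should be checked against the Codabench API.

```python
import time
from typing import Callable

# Assumed terminal status names; verify against the Codabench API.
TERMINAL_STATES = {"finished", "failed", "cancelled"}


def poll_until_done(
    fetch_status: Callable[[], str],
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Call fetch_status repeatedly until a terminal state or timeout."""
    deadline = clock() + timeout_s
    while True:
        status = fetch_status().lower()
        if status in TERMINAL_STATES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"still '{status}' after {timeout_s} seconds")
        # Wait before the next check to avoid hammering the API.
        sleep(interval_s)
```

Passing sleep and clock as parameters makes the loop easy to exercise in tests without real delays.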
How do I add Codabench MCP to my agent?
Install codabench-mcp from PyPI, obtain a DRF token as described in the Codabench API docs (the api-token-auth endpoint), set CODABENCH_API_TOKEN, and add the stdio server entry to your agent's MCP config.
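A minimal sketch of the last step, generating a stdio server entry in the "mcpServers" JSON shape used by common MCP clients. The launch command "codabench-mcp" and the server key "codabench" are assumptions made for illustration; check the package documentation for the actual command and your client's documentation for where the config file lives.

```python
import json


def mcp_server_entry(token, command="codabench-mcp", args=()):
    """Return an MCP client config fragment for a stdio server.

    Assumptions: the package installs a console command named
    "codabench-mcp", and the client reads an "mcpServers" mapping.
    """
    return {
        "mcpServers": {
            "codabench": {
                "command": command,
                "args": list(args),
                # The server requires this token to authenticate.
                "env": {"CODABENCH_API_TOKEN": token},
            }
        }
    }


if __name__ == "__main__":
    # Print a fragment to paste into the client's MCP config file.
    print(json.dumps(mcp_server_entry("<your-token>"), indent=2))
```

Keep the real token out of version control; a placeholder in the committed config with the value injected locally is the safer arrangement.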