Judgmentlabs Judgeval Claude Plugin

judgmentlabs-judgeval-claude-plugin is a Claude Code plugin for the Ship phase that enables Judgeval tracing, logging, and evaluation of assistant conversations and tool calls.

by JudgmentLabs · github.com/JudgmentLabs/judgeval-claude-plugin

Trace and evaluate Claude Code assistant calls with Judgeval logging so you can score conversations, usage, and correctness before shipping agent features.

Install

Add it to Claude Code

Install the plugin in Claude Code. One command, paste-ready.

Install the plugin
/plugin install judgmentlabs-judgeval-claude-plugin@JudgmentLabs/judgeval-claude-plugin
Agent API

Built to be called by your agent

Skillselion is itself an MCP server. Your agent can pull this entry and a paste-ready install config straight from the API - no copy-paste.

Retrieve this entry with skillselion.get_details("plugin:JudgmentLabs/judgeval-claude-plugin") and the paste-ready config with skillselion.get_install_config("plugin:JudgmentLabs/judgeval-claude-plugin").
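As a rough sketch of what those two calls might look like on the wire, the snippet below builds MCP-style JSON-RPC 2.0 `tools/call` request bodies. The tool names are taken from the calls above; the `entry_id` argument key and the `build_tool_call` helper are hypothetical, so check the server's published tool schema before relying on them.

```python
import json
from itertools import count

ENTRY_ID = "plugin:JudgmentLabs/judgeval-claude-plugin"

# Monotonic request ids, as JSON-RPC expects a unique id per request.
_request_ids = count(1)


def build_tool_call(tool_name: str, entry_id: str) -> dict:
    """Build a JSON-RPC 2.0 request body for an MCP `tools/call`."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {
            "name": tool_name,
            # ASSUMPTION: "entry_id" is a placeholder argument name,
            # not confirmed by the listing.
            "arguments": {"entry_id": entry_id},
        },
    }


details_request = build_tool_call("get_details", ENTRY_ID)
config_request = build_tool_call("get_install_config", ENTRY_ID)

print(json.dumps(details_request, indent=2))
```

In practice an MCP client library would send these requests for you; the sketch only shows the shape of the payload.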

About

What it does

judgmentlabs-judgeval-claude-plugin is the official Claude Code CLI integration for Judgeval tracing and evaluation. Indie builders shipping agent-heavy products install it when they need automatic capture of assistant calls, messages, and responses instead of ad hoc copy-pasted logs. The marketplace listing spans two plugins, with keywords covering agents, observability, evaluation, logging, trace usage, scripts, and working-session capture. That places it in Ship-phase testing, with carryover into Operate-phase monitoring when you watch for quality drift.

Use it while hardening skills, running evaluation examples, or proving correctness before you trust an agent path in production. The intermediate complexity rating reflects Judgeval account setup, API keys, and aligning trace semantics with your eval rubric. It complements agent skills rather than replacing them: you still author behavior in skills, while Judgeval supplies the measurement layer that Claude Code lacks natively. It is not a crash reporter or an SEO tool; it is a focused evaluation and observability plugin for Claude Code sessions.

Highlights

  • Claude Code CLI plugin bundle (two plugins) for Judgeval tracing and observability
  • Automatically captures assistant calls, messages, and responses for evaluation workflows
  • Enables logging, trace usage, and correctness checks with helper scripts and examples
  • Targets agents, evaluation, and observability rather than generic app unit tests
  • Works as a bridge between local Claude Code conversations and Judgeval evaluation tooling

Why builders use it

You cannot reliably score or debug Claude Code agent sessions when conversations, calls, and responses are not automatically traced and logged for evaluation.

After install, Claude Code sessions feed Judgeval traces and eval hooks so you can measure usage, run examples, and check assistant correctness before release.

At a glance

  • Type - Plugin in Development Tools.
  • Adoption - 0 installs, 0 stars, 0 votes.

FAQ

Who is judgmentlabs-judgeval-claude-plugin for?

Developers using Claude Code who want Judgeval-backed tracing, logging, and evaluation on assistant messages, calls, and responses.

When should I use judgmentlabs-judgeval-claude-plugin?

Use it during Ship testing (and ongoing Operate monitoring) when you need automatic trace capture and eval examples before trusting agent workflows.

How do I add judgmentlabs-judgeval-claude-plugin to my agent?

Install the JudgmentLabs/judgeval-claude-plugin marketplace bundle in Claude Code, configure Judgeval credentials per the plugin README, and enable the tracing plugins so sessions log automatically.
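A small preflight check can confirm that credentials are present before you start a session. This is a minimal sketch: the variable names `JUDGMENT_API_KEY` and `JUDGMENT_ORG_ID` are assumptions based on common Judgeval SDK usage, and the `missing_credentials` helper is hypothetical. The plugin README is the authority on what it actually reads.

```python
import os
from typing import Mapping, Sequence

# ASSUMPTION: these names are not confirmed by the listing;
# replace them with whatever the plugin README specifies.
ASSUMED_CREDENTIAL_VARS = ("JUDGMENT_API_KEY", "JUDGMENT_ORG_ID")


def missing_credentials(
    required: Sequence[str] = ASSUMED_CREDENTIAL_VARS,
    env: Mapping[str, str] = os.environ,
) -> list[str]:
    """Return the required variable names that are unset or blank."""
    return [name for name in required if not env.get(name, "").strip()]


missing = missing_credentials()
if missing:
    print("Set these before starting Claude Code: " + ", ".join(missing))
else:
    print("Judgeval credentials found in the environment.")
```

Passing `env` explicitly keeps the check easy to test without touching your real environment.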
