Shinpr Rashomon

shinpr-rashomon is a Claude Code plugin for the Ship phase that measures prompt and skill improvements using blind A/B comparison and parallel worktree runs.

by shinpr · github.com/shinpr/rashomon

Run blind A/B comparisons on prompts and agent skills so you can prove which version actually performs better before you ship it.

9 GitHub stars · 0 installs · 0 community votes
Install

Add it to Claude Code

Install the plugin in Claude Code. One command, paste-ready.

Install the plugin
/plugin install shinpr-rashomon@shinpr/rashomon
Agent API

Built to be called by your agent

Skillselion is itself an MCP server. Your agent can pull this entry and a paste-ready install config straight from the API - no copy-paste.

Retrieve this entry with skillselion.get_details("plugin:shinpr/rashomon") and the paste-ready config with skillselion.get_install_config("plugin:shinpr/rashomon").
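For agents that talk to MCP servers directly, the two lookups above would travel as standard MCP `tools/call` requests. The sketch below only builds the JSON-RPC 2.0 request bodies. The exact tool names the server exposes and the argument key (`id` here) are assumptions, not taken from Skillselion's documentation, so check the server's tool listing before relying on them.

```python
import json


def tool_call(request_id, tool, arguments):
    """Build an MCP `tools/call` request body (JSON-RPC 2.0)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


ENTRY = "plugin:shinpr/rashomon"

# Assumption: the server exposes these tool names and accepts an "id" argument.
calls = [
    tool_call(1, "get_details", {"id": ENTRY}),
    tool_call(2, "get_install_config", {"id": ENTRY}),
]

print(json.dumps(calls, indent=2))
```

An MCP client library would normally build and send these messages for you; the raw shape is shown only to make the exchange concrete.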

About

What it does

shinpr-rashomon is a Claude Code plugin for solo builders and small teams who treat agent skills and prompts like code that needs regression checks. Instead of guessing which rewrite is better, you run structured comparisons, often in parallel via worktrees, so that two variants solve the same task under the same constraints. That fits naturally after you draft a skill in Build, when you are validating a prototype, and again before you rely on a prompt in daily Ship workflows. The plugin emphasizes measurement over opinion: evaluate execution quality, compare outcomes, and keep what wins. It is not a marketplace of skills or an OpenAPI generator; it is an evaluation harness focused on prompts, skills, and repeatable agent work. Install it when you are iterating on SKILL.md files, system prompts, or multi-step agent behaviors and need evidence before you standardize on one version.
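To make "blind" concrete, here is a minimal sketch of the general idea, not the plugin's actual implementation. For each task, the two variants' outputs are shown under neutral labels X and Y in random order, and a separate key maps the labels back to variants once the judging is done. The function names `blind_pairs` and `tally` are hypothetical.

```python
import random


def blind_pairs(outputs_a, outputs_b, seed=0):
    """Hide variant identity behind the labels X and Y, per task.

    Returns (trials, key): `trials` is what the judge sees,
    `key` is kept aside and used only to unblind the verdicts.
    """
    rng = random.Random(seed)
    trials, key = [], []
    for task_id, (a, b) in enumerate(zip(outputs_a, outputs_b)):
        a_is_x = rng.random() < 0.5  # coin flip decides which variant is X
        trials.append({"task": task_id,
                       "X": a if a_is_x else b,
                       "Y": b if a_is_x else a})
        key.append({"X": "A" if a_is_x else "B",
                    "Y": "B" if a_is_x else "A"})
    return trials, key


def tally(verdicts, key):
    """Unblind verdicts ('X', 'Y', or anything else for a tie) into win counts."""
    wins = {"A": 0, "B": 0, "tie": 0}
    for verdict, mapping in zip(verdicts, key):
        winner = mapping[verdict] if verdict in ("X", "Y") else "tie"
        wins[winner] += 1
    return wins
```

The judge, human or model, only ever sees `trials`. Because the label order is re-randomized per task, a judge who always favors the first option cannot systematically favor one variant.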

Highlights

  • Blind A/B comparison for prompts and skills so bias does not skew results
  • Parallel execution through worktrees for isolated variants
  • Evaluate skill and prompt improvements with comparable runs
  • Built for iteration on agent workflows, not single-shot codegen
  • Community plugin bundle (1 plugin) from shinpr/rashomon
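For readers unfamiliar with worktrees: `git worktree add` checks out an extra working directory backed by the same repository, so each variant can run in its own folder without touching the other. The sketch below shows one way to set that up by hand. The directory layout and the `eval/` branch prefix are illustrative assumptions, and the plugin may manage its worktrees differently.

```python
import subprocess
from pathlib import Path


def worktree_commands(repo, variants, base="HEAD"):
    """Return one `git worktree add` command per variant name.

    Each variant gets a sibling directory (<repo>-<name>) and its own
    branch (eval/<name>), both starting from `base`.
    """
    repo = Path(repo)
    commands = []
    for name in variants:
        path = repo.parent / f"{repo.name}-{name}"
        commands.append([
            "git", "-C", str(repo), "worktree", "add",
            "-b", f"eval/{name}", str(path), base,
        ])
    return commands


def create_worktrees(repo, variants, base="HEAD"):
    """Run the commands; raises CalledProcessError if git fails."""
    for command in worktree_commands(repo, variants, base):
        subprocess.run(command, check=True)
```

Calling `create_worktrees("path/to/repo", ["variant-a", "variant-b"])` would leave two isolated checkouts side by side; `git worktree remove <path>` cleans them up afterwards.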

Why builders use it

Without it: you cannot tell whether a new prompt or skill is actually better, because you only try one version at a time and judge by gut feel.

With it: you pick winning prompts and skills from blind, parallel comparisons backed by comparable execution evidence instead of anecdotal preference.

At a glance

  • Type - Plugin in Testing.
  • Adoption - 0 installs, 9 stars, 0 votes.

FAQ

Who is shinpr-rashomon for?

Solo and indie builders using Claude Code who maintain custom skills or prompts and want measurable A/B comparisons before standardizing on one version.

When should I use shinpr-rashomon?

Use it whenever you change a skill or prompt and need blind side-by-side runs, especially before you ship, after prototyping two approaches, or while iterating on agent tooling.

How do I add shinpr-rashomon to my agent?

Install the shinpr/rashomon Claude Code plugin from the community listing, enable it in your Claude Code plugins configuration, and invoke its comparison workflow when you have two variants to test in parallel worktrees.
