
Skill Creator
Improve agent skills after blind A/B runs by analyzing why the winner won and what to change in the loser’s SKILL.md and tooling.
Overview
Skill-creator’s post-hoc analyzer is an agent skill most often used in Build (also Ship review) that explains why a blind skill comparison winner won and how to improve the loser’s SKILL.md and execution behavior.
Install
npx skills add https://github.com/anthropics/claude-plugins-official --skill skill-creator
What is this skill?
- Unblinds blind-comparator JSON to explain winner reasoning and scores
- Compares winner vs loser SKILL.md structure, scripts, examples, and edge cases
- Reads both execution transcripts to tie outcomes to instruction clarity
- Outputs actionable improvement suggestions to a specified analysis path
- Designed for the skill-creator eval loop after parallel skill runs
- 7 input parameters: the winner label (A or B) plus six paths covering the winner and loser skills, their transcripts, the comparison JSON, and the output
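As a rough illustration of the SKILL.md comparison in the list above, a plain text diff can serve as a first pass before the analyzer reasons about the differences. This is only a sketch: the helper name `skill_md_diff` is hypothetical, and it assumes each skill path is a directory containing a `SKILL.md` file. The skill itself reads and interprets the files rather than running a diff tool.

```python
import difflib
from pathlib import Path


def skill_md_diff(winner_skill_path: str, loser_skill_path: str) -> str:
    """Return a unified diff from the loser's SKILL.md to the winner's.

    Assumption: each skill path is a directory that holds a SKILL.md file.
    """
    loser_lines = Path(loser_skill_path, "SKILL.md").read_text(encoding="utf-8").splitlines()
    winner_lines = Path(winner_skill_path, "SKILL.md").read_text(encoding="utf-8").splitlines()
    diff = difflib.unified_diff(
        loser_lines,
        winner_lines,
        fromfile="loser/SKILL.md",
        tofile="winner/SKILL.md",
        lineterm="",  # lines were split without newlines, so add none back
    )
    return "\n".join(diff)
```

Lines prefixed with `+` exist only in the winner's instructions, which makes them candidates for what the loser is missing. An empty string means the two files are identical.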
Adoption & trust: 3.4k installs on skills.sh; 29.6k GitHub stars; 3/3 security scanners passed (skills.sh audits).
What problem does it solve?
You finished a blind skill comparison but only know A won—not which instructions, scripts, or gaps caused the loss.
Who is it for?
Builders maintaining SKILL.md packages who already run blind comparisons and want transcript-grounded improvement notes.
Skip if: Greenfield feature work without eval artifacts, or teams that skip documented winner/loser paths and comparison JSON.
When should I use this skill?
After a blind comparator declares A or B the winner and you have paths to both skills, transcripts, and comparison_result_path.
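Before invoking the analyzer, it can help to confirm that every artifact is actually in place. The sketch below is hypothetical: the field names mirror the parameters listed in the skill's Inputs section, but the `AnalyzerInputs` class and its `problems` method are illustrative helpers, not part of the skill.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class AnalyzerInputs:
    """The seven parameters named in the SKILL.md Inputs section."""

    winner: str  # "A" or "B", taken from the blind comparison
    winner_skill_path: str
    winner_transcript_path: str
    loser_skill_path: str
    loser_transcript_path: str
    comparison_result_path: str
    output_path: str  # written by the analyzer, so it need not exist yet

    def problems(self) -> list[str]:
        """Return human-readable issues; an empty list means ready to run."""
        issues = []
        if self.winner not in ("A", "B"):
            issues.append(f"winner must be 'A' or 'B', got {self.winner!r}")
        for name in (
            "winner_skill_path",
            "winner_transcript_path",
            "loser_skill_path",
            "loser_transcript_path",
            "comparison_result_path",
        ):
            path = getattr(self, name)
            if not Path(path).exists():
                issues.append(f"{name} does not exist: {path}")
        return issues
```

Collecting all issues in one pass, rather than failing on the first, makes it easier to fix a half-finished eval directory in a single round.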
What do I get? / Deliverables
You get a structured post-hoc analysis saved to your output path, with skill-level diffs and transcript-backed fixes ready for the next skill-creator iteration.
- Post-hoc analysis document at output_path
- Actionable improvement list tied to comparator reasoning
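Once the analysis file exists, a short script can surface its headline findings. This is a minimal sketch, assuming the file is JSON with the top-level keys `comparison_summary`, `winner_strengths`, and `loser_weaknesses` shown in the skill's Output Format example. The function name `summarize_analysis` is hypothetical, and any key that is absent is treated as empty.

```python
import json
from pathlib import Path


def summarize_analysis(output_path: str) -> str:
    """Render the winner and the strength/weakness lists as plain text."""
    analysis = json.loads(Path(output_path).read_text(encoding="utf-8"))
    summary = analysis.get("comparison_summary", {})
    lines = [f"Winner: {summary.get('winner', '?')}"]
    for key in ("winner_strengths", "loser_weaknesses"):
        lines.append(f"{key}:")
        # Missing keys are treated as empty lists rather than errors.
        lines.extend(f"  - {item}" for item in analysis.get(key, []))
    return "\n".join(lines)
```

Reading with `.get` keeps the sketch tolerant of fields it does not know about, so it should keep working if the analysis file carries more sections than the ones used here.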
Journey fit
Spans multiple journey phases - primary shelf plus alternate fits below.
Skill authoring and packaging live in Build; this skill is the canonical shelf for creators refining agent capabilities. Post-hoc analysis sits next to skill creation, eval harnesses, and procedural SKILL.md design under agent-tooling.
Where it fits
After two SKILL.md variants run blind, unblind results to patch the loser before publishing to Prism.
Before tagging a catalog skill release, verify comparator-winning instructions map to real transcript behavior.
When users report regressions on a skill version, compare the old and new packages using the same post-hoc steps.
How it compares
Use after a blind comparator pass—not as a substitute for ad-hoc “make my skill better” chat without run artifacts.
Common Questions / FAQ
Who is skill-creator post-hoc analysis for?
Solo and indie agent-skill authors who benchmark skills with blind comparisons and need actionable diffs between winning and losing packages.
When should I use skill-creator post-hoc analysis?
Use it in Build when refining agent-tooling after evals, and in Ship review when validating that procedural changes actually fix comparator gaps before you republish.
Is skill-creator safe to install?
It reads local comparison JSON, SKILL.md files, and transcripts you point at—review the Security Audits panel on this Prism page before installing from the official Anthropic plugins repo.
SKILL.md
# Post-hoc Analyzer Agent

Analyze blind comparison results to understand WHY the winner won and generate improvement suggestions.

## Role

After the blind comparator determines a winner, the Post-hoc Analyzer "unblinds" the results by examining the skills and transcripts. The goal is to extract actionable insights: what made the winner better, and how can the loser be improved?

## Inputs

You receive these parameters in your prompt:

- **winner**: "A" or "B" (from blind comparison)
- **winner_skill_path**: Path to the skill that produced the winning output
- **winner_transcript_path**: Path to the execution transcript for the winner
- **loser_skill_path**: Path to the skill that produced the losing output
- **loser_transcript_path**: Path to the execution transcript for the loser
- **comparison_result_path**: Path to the blind comparator's output JSON
- **output_path**: Where to save the analysis results

## Process

### Step 1: Read Comparison Result

1. Read the blind comparator's output at comparison_result_path
2. Note the winning side (A or B), the reasoning, and any scores
3. Understand what the comparator valued in the winning output

### Step 2: Read Both Skills

1. Read the winner skill's SKILL.md and key referenced files
2. Read the loser skill's SKILL.md and key referenced files
3. Identify structural differences:
   - Instructions clarity and specificity
   - Script/tool usage patterns
   - Example coverage
   - Edge case handling

### Step 3: Read Both Transcripts

1. Read the winner's transcript
2. Read the loser's transcript
3. Compare execution patterns:
   - How closely did each follow their skill's instructions?
   - What tools were used differently?
   - Where did the loser diverge from optimal behavior?
   - Did either encounter errors or make recovery attempts?

### Step 4: Analyze Instruction Following

For each transcript, evaluate:

- Did the agent follow the skill's explicit instructions?
- Did the agent use the skill's provided tools/scripts?
- Were there missed opportunities to leverage skill content?
- Did the agent add unnecessary steps not in the skill?

Score instruction following 1-10 and note specific issues.

### Step 5: Identify Winner Strengths

Determine what made the winner better:

- Clearer instructions that led to better behavior?
- Better scripts/tools that produced better output?
- More comprehensive examples that guided edge cases?
- Better error handling guidance?

Be specific. Quote from skills/transcripts where relevant.

### Step 6: Identify Loser Weaknesses

Determine what held the loser back:

- Ambiguous instructions that led to suboptimal choices?
- Missing tools/scripts that forced workarounds?
- Gaps in edge case coverage?
- Poor error handling that caused failures?

### Step 7: Generate Improvement Suggestions

Based on the analysis, produce actionable suggestions for improving the loser skill:

- Specific instruction changes to make
- Tools/scripts to add or modify
- Examples to include
- Edge cases to address

Prioritize by impact. Focus on changes that would have changed the outcome.

### Step 8: Write Analysis Results

Save structured analysis to `{output_path}`.

## Output Format

Write a JSON file with this structure:

```json
{
  "comparison_summary": {
    "winner": "A",
    "winner_skill": "path/to/winner/skill",
    "loser_skill": "path/to/loser/skill",
    "comparator_reasoning": "Brief summary of why comparator chose winner"
  },
  "winner_strengths": [
    "Clear step-by-step instructions for handling multi-page documents",
    "Included validation script that caught formatting errors",
    "Explicit guidance on fallback behavior when OCR fails"
  ],
  "loser_weaknesses": [
    "Vague instruction 'process the document appropriately' led to inconsistent behavior",
    "No script for validation, agent had to improvise and made errors",
    "No guidance on OCR failure, agent gave up instead of trying alternatives"
  ],
  "instruction_following": {
    "winner": {