
Skill Comply
Audit agent session tool traces against expected skill workflows so solo builders can verify agents followed test-first or compliance patterns.
Overview
Skill Comply is an agent skill, most often used in the Ship phase (also Build and Operate), that checks agent session tool traces for adherence to expected skill workflows.
Install
npx skills add https://github.com/affaan-m/everything-claude-code --skill skill-comply
What is this skill?
- Oriented around structured tool_complete session logs (Write, Bash, Edit, pytest flows)
- Supports validating whether an agent respected the intended skill order, for example writing tests before implementation
- Fits everything-claude-code meta-tooling for skill governance on Claude Code
- Useful when replaying or comparing sessions (sess-001 vs sess-002 style traces)
- Pairs with skill authoring and eval habits in the same ecosystem
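To make the tool_complete log format concrete, here is a minimal sketch of reading such JSON lines and grouping them by session. It assumes one JSON object per line and an `input` field that is itself a JSON-encoded string, as in the sample traces in the SKILL.md excerpt. The function name `load_sessions` is hypothetical and is not part of the skill.

```python
import json
from collections import defaultdict

# Two sample records in the same shape as the skill's sess-001 / sess-002 traces.
SAMPLE_LINES = [
    r'{"timestamp":"2026-03-20T10:00:01Z","event":"tool_complete","tool":"Write","session":"sess-001","input":"{\"file_path\":\"tests/test_fib.py\",\"content\":\"def test_fib(): assert fib(0) == 0\"}","output":"File created"}',
    r'{"timestamp":"2026-03-20T10:00:01Z","event":"tool_complete","tool":"Write","session":"sess-002","input":"{\"file_path\":\"src/fib.py\",\"content\":\"def fib(n): return n if n <= 1 else fib(n-1)+fib(n-2)\"}","output":"File created"}',
]


def load_sessions(lines):
    """Group tool_complete events by session id (hypothetical helper)."""
    sessions = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the log
        event = json.loads(line)
        if event.get("event") != "tool_complete":
            continue  # ignore other event types
        # Assumption: "input" holds a JSON-encoded string, so decode it too.
        event["input"] = json.loads(event["input"])
        sessions[event["session"]].append(event)
    # ISO-8601 UTC timestamps sort correctly as plain strings.
    for events in sessions.values():
        events.sort(key=lambda e: e["timestamp"])
    return dict(sessions)


sessions = load_sessions(SAMPLE_LINES)
for session_id, events in sessions.items():
    print(session_id, [(e["tool"], e["input"]["file_path"]) for e in events])
```

Grouping by session first keeps later ordering checks simple, since each check then only sees one session's events in time order.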
Adoption & trust: 3.7k installs on skills.sh; 210k GitHub stars; 2/3 security scanners passed (skills.sh audits).
What problem does it solve?
You cannot tell whether your coding agent actually followed test-first or skill-mandated steps after a long autonomous session.
Who is it for?
Solo builders maintaining custom Claude Code skills who log tool events and want repeatable governance on agent behavior.
Skip if: Greenfield feature work with no session capture—use planning and implementation skills first, then comply-style review.
When should I use this skill?
After agent sessions complete, when comparing tool traces for skill adherence, or when authoring skills that require ordered Write, Bash, and Edit steps.
What do I get? / Deliverables
You get a compliance-oriented read of tool_complete sequences so you can reject or fix sessions that skipped required steps before merging.
- Compliance assessment of session tool ordering
- Gap report when mandated steps were skipped or reordered
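As an illustration of what a gap report on tool ordering might contain, the sketch below checks a single rule: a test file should be written before any implementation file. It is a simplified keyword heuristic, not the skill's own logic, and the names `is_test_path` and `tdd_gap_report` are hypothetical.

```python
def is_test_path(path):
    """Heuristic assumption: a file is a test file if its name contains 'test'."""
    filename = path.rsplit("/", 1)[-1]
    return "test" in filename


def tdd_gap_report(written_paths):
    """Return a list of gap messages for an ordered list of written file paths."""
    first_test = next(
        (i for i, p in enumerate(written_paths) if is_test_path(p)), None
    )
    first_impl = next(
        (i for i, p in enumerate(written_paths) if not is_test_path(p)), None
    )
    gaps = []
    if first_test is None:
        gaps.append("no test file was written")
    elif first_impl is not None and first_impl < first_test:
        gaps.append(
            f"implementation {written_paths[first_impl]} written before "
            f"test {written_paths[first_test]}"
        )
    return gaps


# Test-first session: no gaps expected.
print(tdd_gap_report(["tests/test_fib.py", "src/fib.py"]))
# Implementation-first session: one gap expected.
print(tdd_gap_report(["src/fib.py", "tests/test_fib.py"]))
```

An empty list means the session passed this one rule; a real compliance check would cover more steps than this single ordering constraint.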
Journey fit
Spans multiple journey phases - primary shelf plus alternate fits below.
The canonical shelf is Ship, under the Review subphase, because the skill evaluates completed agent sessions (writes, bash commands, edits), much like a post-run compliance check before you trust the diff. Review fits trace analysis and adherence verification after a coding agent finishes a loop, not greenfield scaffolding.
Where it fits
Replay a pytest-fail-then-fix session log to confirm the agent did not skip the failing test step.
Validate a new custom skill’s mandated tool order using captured tool_complete JSON lines.
Spot regressions where agents write implementation before tests across recurring maintenance sessions.
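The first scenario above, confirming that a failing test run happened before the fix, can be sketched as a single pass over the session's tool calls. This is a keyword-based simplification under stated assumptions: the `ToolCall` type and `saw_red_before_impl` function are hypothetical, and matching on "pytest" and "failed" is only a rough stand-in for judging what each tool call means.

```python
from typing import NamedTuple


class ToolCall(NamedTuple):
    tool: str     # "Write", "Edit", or "Bash"
    target: str   # file path for Write/Edit, shell command for Bash
    output: str   # tool output text


def saw_red_before_impl(calls):
    """True if a failing pytest run appears before the first implementation write."""
    for call in calls:
        if (
            call.tool == "Bash"
            and "pytest" in call.target
            and "failed" in call.output.lower()
        ):
            return True  # RED phase observed first
        if call.tool in ("Write", "Edit"):
            filename = call.target.rsplit("/", 1)[-1]
            if "test" not in filename:
                return False  # implementation written before any failing run
    return False  # no failing run found at all


test_first = [
    ToolCall("Write", "tests/test_fib.py", "File created"),
    ToolCall("Bash", "cd /tmp/sandbox && pytest tests/", "FAILED - 1 failed"),
    ToolCall("Write", "src/fib.py", "File created"),
]
impl_first = [
    ToolCall("Write", "src/fib.py", "File created"),
    ToolCall("Write", "tests/test_fib.py", "File created"),
    ToolCall("Bash", "cd /tmp/sandbox && pytest tests/", "1 passed"),
]

print(saw_red_before_impl(test_first))
print(saw_red_before_impl(impl_first))
```

Running the same check over recurring maintenance sessions would surface the regression case, where implementation writes start appearing before any failing test run.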
How it compares
A session-trace checker for skill adherence, not a linter for application source code or a security scanner by itself.
Common Questions / FAQ
Who is skill-comply for?
Indie builders and skill authors using Claude Code who need to verify agent sessions against documented workflows using tool trace data.
When should I use skill-comply?
After Ship-phase agent runs when reviewing diffs, during Build agent-tooling when tuning skills, and during Operate iterate when agent behavior drifts—especially when comparing pytest-first versus implementation-first sessions.
Is skill-comply safe to install?
Expect filesystem and shell-adjacent workflows if your compliance checks replay local logs; review the Security Audits panel on this Prism page and inspect the full SKILL.md in the repo—the ingested readme fragment is truncated.
SKILL.md
.venv/
__pycache__/
*.py[cod]
results/*.md
.pytest_cache/
.coverage
uv.lock

{"timestamp":"2026-03-20T10:00:01Z","event":"tool_complete","tool":"Write","session":"sess-001","input":"{\"file_path\":\"tests/test_fib.py\",\"content\":\"def test_fib(): assert fib(0) == 0\"}","output":"File created"}
{"timestamp":"2026-03-20T10:00:10Z","event":"tool_complete","tool":"Bash","session":"sess-001","input":"{\"command\":\"cd /tmp/sandbox && pytest tests/\"}","output":"FAILED - 1 failed"}
{"timestamp":"2026-03-20T10:00:20Z","event":"tool_complete","tool":"Write","session":"sess-001","input":"{\"file_path\":\"src/fib.py\",\"content\":\"def fib(n): return n if n <= 1 else fib(n-1)+fib(n-2)\"}","output":"File created"}
{"timestamp":"2026-03-20T10:00:30Z","event":"tool_complete","tool":"Bash","session":"sess-001","input":"{\"command\":\"cd /tmp/sandbox && pytest tests/\"}","output":"1 passed"}
{"timestamp":"2026-03-20T10:00:40Z","event":"tool_complete","tool":"Edit","session":"sess-001","input":"{\"file_path\":\"src/fib.py\",\"old_string\":\"return n if\",\"new_string\":\"if n < 0: raise ValueError\\n return n if\"}","output":"File edited"}

{"timestamp":"2026-03-20T10:00:01Z","event":"tool_complete","tool":"Write","session":"sess-002","input":"{\"file_path\":\"src/fib.py\",\"content\":\"def fib(n): return n if n <= 1 else fib(n-1)+fib(n-2)\"}","output":"File created"}
{"timestamp":"2026-03-20T10:00:10Z","event":"tool_complete","tool":"Write","session":"sess-002","input":"{\"file_path\":\"tests/test_fib.py\",\"content\":\"def test_fib(): assert fib(0) == 0\"}","output":"File created"}
{"timestamp":"2026-03-20T10:00:20Z","event":"tool_complete","tool":"Bash","session":"sess-002","input":"{\"command\":\"cd /tmp/sandbox && pytest tests/\"}","output":"1 passed"}

id: tdd-workflow
name: TDD Workflow Compliance
source_rule: rules/common/testing.md
version: "2.0"
steps:
  - id: write_test
    description: "Write test file BEFORE implementation"
    required: true
    detector:
      description: "A Write or Edit to a test file (filename contains 'test')"
      before_step: write_impl
  - id: run_test_red
    description: "Run test and confirm FAIL (RED phase)"
    required: true
    detector:
      description: "Run pytest or test command that produces a FAIL/ERROR result"
      after_step: write_test
      before_step: write_impl
  - id: write_impl
    description: "Write minimal implementation (GREEN phase)"
    required: true
    detector:
      description: "Write or Edit an implementation file (not a test file)"
      after_step: run_test_red
  - id: run_test_green
    description: "Run test and confirm PASS (GREEN phase)"
    required: true
    detector:
      description: "Run pytest or test command that produces a PASS result"
      after_step: write_impl
  - id: refactor
    description: "Refactor (IMPROVE phase)"
    required: false
    detector:
      description: "Edit a source file for refactoring after tests pass"
      after_step: run_test_green
scoring:
  threshold_promote_to_hook: 0.6

You are classifying tool calls from a coding agent session against expected behavioral steps.
For each tool call, determine which step (if any) it belongs to. A tool call can match at most one step.

Steps:
{steps_description}

Tool calls (numbered):
{tool_calls}

Respond with ONLY a JSON object mapping step_id to a list of matching tool call numbers. Include only steps that have at least one match. If no tool calls match a step, omit it.

Example response:
{"write_test": [0, 1], "run_test_red": [2], "write_impl": [3, 4]}

Rules:
- Match based on the MEANING of the tool call, not just keywords
- A Write to "test_calculator.py" is a test file write, even if the content is implementation-like
- A Write to "calculator.py" is an implementation write, even if it contains test helpers
- A Bash running "pytest" that outputs "FAILED" is a RED phase test run
- A Bash running "pytest" that outputs "passed" is a GREEN phase test run
- Each tool ca