
Scrutinize
Get an outsider second opinion on a plan, PR, or diff that traces real code paths and questions whether the change should exist at all.
Overview
scrutinize is an agent skill most often used in Ship (also Validate scope, Build pm) that performs an outsider end-to-end review questioning intent, simpler alternatives, and whether code paths match the claimed goal.
Install
npx skills add https://github.com/thananon/9arm-skills --skill scrutinizeWhat is this skill?
- Ordered workflow: intent first, simpler-alternative check, then end-to-end code-path verification
- Outsider stance—ignores author intent and reads the artifact cold
- Every finding requires what to change, why, and evidence—not diff paraphrase
- Hard stop when the goal cannot be stated in one sentence (underspecified artifact)
- Triggers on /scrutinize and proactive review, audit, sanity-check, or second-opinion requests
- Workflow runs ordered steps starting with intent, then simpler-alternative analysis, then end-to-end verification—do not
Adoption & trust: 1.2k installs on skills.sh; 2.7k GitHub stars; 3/3 security scanners passed (skills.sh audits).
What problem does it solve?
You are about to merge or build from a plan that sounds reasonable in the diff but might be unnecessary, over-scoped, or wrong once you follow real call paths.
Who is it for?
Indie developers who want a structured second opinion before merging agent-generated PRs or committing to a design.
Skip if: Automated style or CVE scanning, or when you only need formatting fixes without architectural judgment.
When should I use this skill?
/scrutinize, or when asked to review, audit, sanity-check, or get a second opinion on a plan, PR, diff, design doc, or proposed code change.
What do I get? / Deliverables
You receive concise, evidence-backed actions— including stop/ simplify recommendations—after intent and full-path verification, so you merge or implement only what survives outsider scrutiny.
- Concise review with actionable items (change, why, evidence)
- Stop or simplify recommendation when intent is unclear or a smaller path exists
Recommended Skills
Journey fit
Spans multiple journey phases - primary shelf plus alternate fits below.
Ship review is the canonical shelf because the skill is built for PRs, diffs, and pre-merge verification—not early ideation alone. Review subphase captures end-to-end scrutiny, call-graph tracing, and actionable findings before you merge or ship.
Where it fits
Challenge whether a proposed feature doc is load-bearing or could be dropped for a smaller validation path.
Scrutinize an implementation plan for duplicate utilities already in the codebase before the agent writes code.
Trace a PR’s new branch through real handlers to confirm it fixes the stated bug without silent regressions.
How it compares
Human-style outsider review workflow—not a drop-in replacement for static analysis or test runners.
Common Questions / FAQ
Who is scrutinize for?
Solo and small-team builders using AI agents who want intent checks and call-graph verification before trusting a plan, PR, or design doc.
When should I use scrutinize?
During Ship review on PRs, at Validate when a scope doc feels heavy, and in Build when you want a sanity pass on a proposed change before coding.
Is scrutinize safe to install?
Review the Security Audits panel on this Prism page; the skill reads your repo context—grant only the workspace access your agent already uses.
SKILL.md
READMESKILL.md - Scrutinize
# Scrutinize Stand outside the change and ask whether it should exist at all, then verify it actually does what it claims end-to-end. ## Operating stance - **Outsider.** Forget who wrote it and why they think it's right. Read the artifact cold. - **End-to-end, not diff-local.** The diff is the entry point, not the scope. Follow the call graph through real code paths. - **Actionable, concise, with rationale.** Every finding states *what to change*, *why*, and *what evidence* led you there. No filler, no restating the diff back. ## Workflow Run these in order. Do not skip ahead. ### 1. Intent — what is this actually trying to do? - State the goal in one sentence, in your own words. If you cannot, the artifact is underspecified — say so and stop. - Ask: **is there a simpler, smaller, or more elegant way to achieve the same goal?** Consider: - Doing nothing (is the problem real / load-bearing?). - Using something that already exists in the codebase instead of adding new surface. - A smaller change that solves 90% of the goal with 10% of the risk. - Solving it at a different layer (config vs code, framework vs app, build vs runtime). - If a better alternative exists, name it explicitly with rationale. This is the most valuable thing you can output — surface it before the line-by-line review. ### 2. Trace — walk the actual code path - For each behavior the change claims, trace the path end-to-end through the real code, not just the lines in the diff: - Entry point → call sites → branches taken → state mutated → exit / return / side effect. - Include the unchanged code on either side of the diff. Bugs hide at the seams. - For a plan or design doc: trace the proposed flow against the existing system. Where does it touch reality? What does it assume that isn't true? - Note every place the trace surprises you (unexpected branch, dead code reached, state you didn't know existed). Surprises are signal. ### 3. Verify — does it actually do what it claims? For each claim the change/plan makes, answer: - **Does the code path you just traced actually produce that behavior?** Walk it explicitly. "It claims X. Path: A → B → C. At C, [observation]. Therefore [holds / doesn't hold]." - **What inputs / states would break it?** Edge cases, concurrent callers, error paths, partial failures, retries, empty/null/unicode/huge inputs, ordering assumptions. - **What does it silently change?** Performance, error semantics, observability, contract for other callers, on-disk / on-wire format. - **How is it tested?** Do the tests actually exercise the traced path, or do they pass while skipping it (mocks that hide the bug, asserts on intermediate state, happy path only)? ### 4. Report Output one tight section per finding. Order by severity (blocker → major → nit). For each: - **Finding** — one sentence, specific. Cite `file:line` when applicable. - **Why it matters** — the consequence, not the principle. - **Evidence** — the trace step or input that exposes it. - **Suggested change** — concrete, minimal. Close with a one-line verdict: ship / fix-then-ship / rework / reject — with the single biggest reason. ## Operating rules - **No rubber-stamps.** "LGTM" is not an output. If you genuinely find nothing, say what you traced and what you checked, so the user can judge whether your review covered the surface they cared about. - **Cite or it didn't happen.** Every claim about the code references a s