Scrutinize

Name: Scrutinize
Author: thananon

thananon/9arm-skills

Get an outsider's second opinion on a plan, PR, or diff: a review that traces real code paths and questions whether the change should exist at all.

Overview

scrutinize is an agent skill that performs an outsider, end-to-end review: it questions intent, looks for simpler alternatives, and checks whether the code paths match the claimed goal. It is most often used in the Ship phase, and also fits Validate (scope & plan) and Build (project management & tracking).

Install

npx skills add https://github.com/thananon/9arm-skills --skill scrutinize

What is this skill?

Ordered workflow: intent first, simpler-alternative check, then end-to-end code-path verification
Outsider stance—ignores author intent and reads the artifact cold
Every finding requires what to change, why, and evidence—not diff paraphrase
Hard stop when the goal cannot be stated in one sentence (underspecified artifact)
Triggers on /scrutinize and proactive review, audit, sanity-check, or second-opinion requests

Compatible agents: Claude Code, Cursor, Codex, Windsurf

Adoption & trust: 1.2k installs on skills.sh; 2.7k GitHub stars; 3/3 security scanners passed (skills.sh audits).

What problem does it solve?

You are about to merge or build from a plan that sounds reasonable in the diff but might be unnecessary, over-scoped, or wrong once you follow real call paths.

Who is it for?

Indie developers who want a structured second opinion before merging agent-generated PRs or committing to a design.

Skip if: you need automated style or CVE scanning, or only formatting fixes without architectural judgment.

When should I use this skill?

Run /scrutinize, or let it trigger when the agent is asked to review, audit, sanity-check, or give a second opinion on a plan, PR, diff, design doc, or proposed code change.

What do I get? / Deliverables

You receive concise, evidence-backed actions, including stop or simplify recommendations, after intent and full-path verification, so you merge or implement only what survives outsider scrutiny.

Concise review with actionable items (change, why, evidence)
Stop or simplify recommendation when intent is unclear or a smaller path exists

Journey fit

Spans multiple journey phases - primary shelf plus alternate fits below.

Primary fit

Ship review is the canonical shelf because the skill is built for PRs, diffs, and pre-merge verification, not early ideation alone. The Review subphase captures end-to-end scrutiny, call-graph tracing, and actionable findings before you merge or ship.

Also useful

Validate: Scope & plan

Build: Project management & tracking

Where it fits

Validate: Scope & plan
Example use: Challenge whether a proposed feature doc is load-bearing or could be dropped for a smaller validation path.

Build: Project management & tracking
Example use: Scrutinize an implementation plan for duplicate utilities already in the codebase before the agent writes code.

Ship: Code review
Example use: Trace a PR’s new branch through real handlers to confirm it fixes the stated bug without silent regressions.

How it compares

Human-style outsider review workflow—not a drop-in replacement for static analysis or test runners.

Common Questions / FAQ

Who is scrutinize for?

Solo and small-team builders using AI agents who want intent checks and call-graph verification before trusting a plan, PR, or design doc.

When should I use scrutinize?

During Ship review on PRs, at Validate when a scope doc feels heavy, and in Build when you want a sanity pass on a proposed change before coding.

Is scrutinize safe to install?

Review the Security Audits panel on this Prism page; the skill reads your repo context—grant only the workspace access your agent already uses.

SKILL.md

# Scrutinize

Stand outside the change and ask whether it should exist at all, then verify it actually does what it claims end-to-end.

## Operating stance

- **Outsider.** Forget who wrote it and why they think it's right. Read the artifact cold.
- **End-to-end, not diff-local.** The diff is the entry point, not the scope. Follow the call graph through real code paths.
- **Actionable, concise, with rationale.** Every finding states *what to change*, *why*, and *what evidence* led you there. No filler, no restating the diff back.

## Workflow

Run these in order. Do not skip ahead.

### 1. Intent — what is this actually trying to do?

- State the goal in one sentence, in your own words. If you cannot, the artifact is underspecified — say so and stop.
- Ask: **is there a simpler, smaller, or more elegant way to achieve the same goal?** Consider:
  - Doing nothing (is the problem real / load-bearing?).
  - Using something that already exists in the codebase instead of adding new surface.
  - A smaller change that solves 90% of the goal with 10% of the risk.
  - Solving it at a different layer (config vs code, framework vs app, build vs runtime).
- If a better alternative exists, name it explicitly with rationale. This is the most valuable thing you can output — surface it before the line-by-line review.

### 2. Trace — walk the actual code path

- For each behavior the change claims, trace the path end-to-end through the real code, not just the lines in the diff:
  - Entry point → call sites → branches taken → state mutated → exit / return / side effect.
  - Include the unchanged code on either side of the diff. Bugs hide at the seams.
- For a plan or design doc: trace the proposed flow against the existing system. Where does it touch reality? What does it assume that isn't true?
- Note every place the trace surprises you (unexpected branch, dead code reached, state you didn't know existed). Surprises are signal.
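The trace described above can be kept as a small structured record rather than loose notes. The sketch below is only an illustration and is not part of the skill: `TraceStep`, `surprises`, `seam_steps`, and every file path in `example_trace` are hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass
class TraceStep:
    location: str           # where you are, e.g. "orders.py:55" (hypothetical path)
    kind: str               # "entry", "call", "branch", "mutation", or "exit"
    note: str = ""          # what you observed at this step
    in_diff: bool = False   # True if the line is part of the change itself
    surprise: bool = False  # True if the step contradicted your expectation


def surprises(trace: list[TraceStep]) -> list[TraceStep]:
    """Steps worth raising in the review: anything that surprised the reader."""
    return [step for step in trace if step.surprise]


def seam_steps(trace: list[TraceStep]) -> list[TraceStep]:
    """Unchanged steps that sit directly before or after a changed step."""
    seams = []
    for i, step in enumerate(trace):
        if step.in_diff:
            continue
        before = i > 0 and trace[i - 1].in_diff
        after = i + 1 < len(trace) and trace[i + 1].in_diff
        if before or after:
            seams.append(step)
    return seams


# A made-up trace: entry point -> changed call -> branch -> mutation -> exit.
example_trace = [
    TraceStep("routes.py:10", "entry", "POST /orders"),
    TraceStep("orders.py:55", "call", "create_order()", in_diff=True),
    TraceStep("orders.py:80", "branch", "retry branch taken on timeout", surprise=True),
    TraceStep("db.py:120", "mutation", "row inserted"),
    TraceStep("orders.py:90", "exit", "returns 201"),
]
```

Recording `in_diff` per step makes it easy to check that the trace actually left the diff, and `seam_steps` lists the unchanged neighbours of the change, where the skill says bugs tend to hide.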

### 3. Verify — does it actually do what it claims?

For each claim the change/plan makes, answer:

- **Does the code path you just traced actually produce that behavior?** Walk it explicitly. "It claims X. Path: A → B → C. At C, [observation]. Therefore [holds / doesn't hold]."
- **What inputs / states would break it?** Edge cases, concurrent callers, error paths, partial failures, retries, empty/null/unicode/huge inputs, ordering assumptions.
- **What does it silently change?** Performance, error semantics, observability, contract for other callers, on-disk / on-wire format.
- **How is it tested?** Do the tests actually exercise the traced path, or do they pass while skipping it (mocks that hide the bug, asserts on intermediate state, happy path only)?
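A minimal sketch of the "mocks that hide the bug" case from the last question, assuming a made-up signup flow. `normalize`, `register`, and both check functions are hypothetical and exist only to show how a test can pass while never exercising the traced path.

```python
def normalize(email: str) -> str:
    # Hypothetical bug: trims whitespace but never lowercases the address.
    return email.strip()


def register(email: str, store: set, normalizer=normalize) -> bool:
    """Return True if the email was added, False if it was already present."""
    key = normalizer(email)
    if key in store:
        return False
    store.add(key)
    return True


def mocked_check_passes() -> bool:
    """Stubs out normalize, so the buggy function is never executed."""
    store = {"a@example.com"}
    stub = lambda _email: "a@example.com"
    return register("A@Example.com", store, normalizer=stub) is False


def real_path_check_passes() -> bool:
    """Runs the real normalize, which is the path a reviewer would trace."""
    store = {"a@example.com"}
    return register("A@Example.com", store) is False
```

Here `mocked_check_passes()` returns `True` while `real_path_check_passes()` returns `False`: the duplicate is only detected when the stub supplies the normalized value. A review that reads the test names alone would miss this; walking the path through `normalize` exposes it.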

### 4. Report

Output one tight section per finding. Order by severity (blocker → major → nit). For each:

- **Finding** — one sentence, specific. Cite `file:line` when applicable.
- **Why it matters** — the consequence, not the principle.
- **Evidence** — the trace step or input that exposes it.
- **Suggested change** — concrete, minimal.

Close with a one-line verdict: ship / fix-then-ship / rework / reject — with the single biggest reason.
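The report shape above maps naturally onto a small data structure. The following is a sketch under stated assumptions, not the skill's own output format: `Finding`, `render_report`, and the exact text layout are hypothetical; only the four fields, the severity order, and the four verdicts come from the text.

```python
from dataclasses import dataclass

# Severity order and verdicts as listed in the Report step.
SEVERITY_ORDER = {"blocker": 0, "major": 1, "nit": 2}
VERDICTS = ("ship", "fix-then-ship", "rework", "reject")


@dataclass
class Finding:
    severity: str          # "blocker", "major", or "nit"
    finding: str           # one specific sentence, with file:line when applicable
    why_it_matters: str    # the consequence, not the principle
    evidence: str          # the trace step or input that exposes it
    suggested_change: str  # concrete and minimal


def render_report(findings: list[Finding], verdict: str, reason: str) -> str:
    """Render findings ordered by severity, closing with a one-line verdict."""
    if verdict not in VERDICTS:
        raise ValueError(f"verdict must be one of {VERDICTS}, got {verdict!r}")
    # sorted() is stable, so findings of equal severity keep their given order.
    ordered = sorted(findings, key=lambda f: SEVERITY_ORDER[f.severity])
    lines = []
    for f in ordered:
        lines.append(f"[{f.severity}] Finding: {f.finding}")
        lines.append(f"  Why it matters: {f.why_it_matters}")
        lines.append(f"  Evidence: {f.evidence}")
        lines.append(f"  Suggested change: {f.suggested_change}")
    lines.append(f"Verdict: {verdict}. {reason}")
    return "\n".join(lines)
```

Rejecting any verdict outside the four listed values is one way to enforce the "no rubber-stamps" rule mechanically: a bare "LGTM" cannot be rendered as a report.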

## Operating rules

- **No rubber-stamps.** "LGTM" is not an output. If you genuinely find nothing, say what you traced and what you checked, so the user can judge whether your review covered the surface they cared about.
- **Cite or it didn't happen.** Every claim about the code references a s
