
Complexity Detection
Score a task on five axes so Cavekit can right-size budgets, model choice, and review depth before sketch, map, or make.
Overview
complexity-detection is a journey-wide agent skill that scores tasks on five 0–4 axes (0–20 total) so Cavekit can right-size depth, models, and review.
Install
npx skills add https://github.com/juliusbrussee/cavekit --skill complexity-detection
What is this skill?
- Deterministic 5-axis scoring rubric summed 0–20 (files, type, judgment, cross-component, novelty)
- Maps totals to quick, standard, or thorough depth for downstream Cavekit commands
- Shared by /ck:sketch (kit default), /ck:map (per-task depth), and /ck:make (per-task budgets)
- Dedicated ck:complexity agent path documented with haiku model routing
- Trigger phrases: how complex, what depth, pick a depth, classify this task
- 5 scoring axes from 0–4 each
- 0–20 total complexity sum
- 3 depth bands: quick, standard, thorough
Adoption & trust: 5 installs on skills.sh; 1k GitHub stars; 3/3 security scanners passed (skills.sh audits).
What problem does it solve?
You cannot tell whether a task is a quick chore or a cross-repo architectural change, so budgets and model routing stay misaligned.
Who is it for?
Builders using the Cavekit /ck commands who want deterministic depth before sketching kits, mapping tasks, or executing make.
Skip if: You are not on Cavekit and need human story-point estimation rather than the five-axis rubric.
When should I use this skill?
Trigger phrases: "how complex", "what depth", "pick a depth", "classify this task"; used by /ck:sketch, /ck:map, /ck:make, and ck:complexity.
What do I get? / Deliverables
You get a quick, standard, or thorough classification that /ck:sketch, /ck:map, /ck:make, and ck:complexity use for consistent pipeline sizing.
- Quick, standard, or thorough depth label
- Per-axis 0–4 scores and 0–20 total
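A minimal sketch of how a consumer might sanity-check such a result before trusting it. It assumes the JSON shape shown in the SKILL.md section of this page (`score`, `depth`, `axes`, `overrides_applied`); the helper names `band_for` and `check_result` are hypothetical and not part of Cavekit.

```python
import json

VALID_DEPTHS = ("quick", "standard", "thorough")


def band_for(score: int) -> str:
    """Depth band for a 0-20 total, before any override is applied."""
    if score <= 6:
        return "quick"
    if score <= 13:
        return "standard"
    return "thorough"


def check_result(raw: str) -> dict:
    """Parse a complexity result and reject internally inconsistent ones.

    Hypothetical helper: the field names follow the example JSON in SKILL.md.
    """
    result = json.loads(raw)
    axes = result["axes"]
    if len(axes) != 5:
        raise ValueError("expected exactly five axes")
    for name, value in axes.items():
        if not 0 <= value <= 4:
            raise ValueError(f"axis {name!r} out of range: {value}")
    if sum(axes.values()) != result["score"]:
        raise ValueError("score does not equal the sum of the axes")
    if result["depth"] not in VALID_DEPTHS:
        raise ValueError(f"unknown depth: {result['depth']!r}")
    # Overrides may move depth one step, so enforce the band only when none applied.
    if not result.get("overrides_applied") and result["depth"] != band_for(result["score"]):
        raise ValueError("depth does not match the score band")
    return result


sample = (
    '{"score": 11, "depth": "standard", '
    '"axes": {"files": 2, "type": 2, "judgment": 2, "cross_component": 2, "novelty": 3}, '
    '"overrides_applied": []}'
)
checked = check_result(sample)
```

The band check is skipped when overrides are listed because an override can legitimately move the depth away from the band implied by the raw total.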
Journey fit
Useful at every journey phase - explore requirements and options before committing to a direction.
Where it fits
Default kit complexity in /ck:sketch before generating task maps for a new feature.
Choose thorough review depth when the judgment axis scores critical (security or production risk).
Classify a hotfix as mechanical vs cross-cutting before assigning agent budget for the patch.
Decide whether a prototype spike is quick or research-heavy before committing build time.
How it compares
Use instead of gut-feel 'small/medium/large' labels when routing agent work through Cavekit.
Common Questions / FAQ
Who is complexity-detection for?
Solo builders and agent operators using Julius Brussee's Cavekit pipeline who need shared depth labels before sketch, map, or make.
When should I use complexity-detection?
At the start of a Build task, before Ship review depth is chosen, or during Operate when classifying incident-fix scope: anytime you ask "how complex", "what depth", "pick a depth", or "classify this task".
Is complexity-detection safe to install?
It is a scoring rubric with no external calls by itself; review the Security Audits panel on this page for the parent Cavekit repo permissions.
SKILL.md
# Complexity Detection

A deterministic scoring rubric. Five axes, 0–4 each, summed to 0–20.

## Axes

| Axis                  | 0              | 1             | 2             | 3                    | 4                   |
|-----------------------|----------------|---------------|---------------|----------------------|---------------------|
| **Files touched**     | 0–2            | 3–5           | 6–10          | 11–20                | 20+                 |
| **Type**              | chore / format | refactor      | feature       | cross-cutting        | architectural       |
| **Judgment required** | mechanical     | low-ambiguity | medium        | high                 | critical (sec/prod) |
| **Cross-component**   | single module  | two modules   | three modules | many within one repo | multi-repo          |
| **Novelty**           | known pattern  | rare pattern  | novel         | research needed      | unknown unknowns    |

Total score maps to:

| Score  | Depth    |
|--------|----------|
| 0 – 6  | quick    |
| 7 – 13 | standard |
| 14+    | thorough |

## Override signals

Upgrade one step regardless of score when any of these are true:

- Security-sensitive: authentication, authorization, crypto, secrets, PII.
- Data migration that is not reversible.
- Public API shape change (breaking).
- Performance-critical hot path with an existing SLA.

Downgrade one step only when **all** of these are true:

- Zero new dependencies.
- Existing tests cover the change.
- No user-visible behaviour change.
- Single file, single function.

## Per-depth defaults

| Depth    | Token budget | Model tier  | Review    | Tests                    |
|----------|--------------|-------------|-----------|--------------------------|
| quick    | 8 000        | haiku       | optional  | smoke                    |
| standard | 20 000       | sonnet      | required  | unit + integration       |
| thorough | 45 000       | sonnet/opus | mandatory | unit + integration + E2E |

These defaults are recorded in `.cavekit/config.json` under `task_budgets` and consumed by the `cavekit-router.cjs` model router.

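The rubric above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Cavekit's implementation: the function names, the dictionary layout, and the rule that an upgrade signal takes precedence over a downgrade are all hypothetical, since the rubric does not say what happens when both apply.

```python
from __future__ import annotations

DEPTHS = ["quick", "standard", "thorough"]
AXES = ("files", "type", "judgment", "cross_component", "novelty")

# Per-depth defaults copied from the table above (key names are assumptions).
DEFAULTS = {
    "quick": {"token_budget": 8_000, "model_tier": "haiku", "review": "optional"},
    "standard": {"token_budget": 20_000, "model_tier": "sonnet", "review": "required"},
    "thorough": {"token_budget": 45_000, "model_tier": "sonnet/opus", "review": "mandatory"},
}


def base_depth(axes: dict[str, int]) -> tuple[int, str]:
    """Sum the five 0-4 axis scores and map the total to a depth band."""
    for name in AXES:
        if not 0 <= axes[name] <= 4:
            raise ValueError(f"axis {name!r} must be in 0..4, got {axes[name]}")
    score = sum(axes[name] for name in AXES)
    if score <= 6:
        depth = "quick"
    elif score <= 13:
        depth = "standard"
    else:
        depth = "thorough"
    return score, depth


def apply_overrides(depth: str, upgrade_signals: list[bool], downgrade_checks: list[bool]) -> str:
    """Move one step up if ANY upgrade signal holds, else one step down if ALL downgrade checks hold.

    Assumption: an upgrade wins over a downgrade; the result is clamped to the three bands.
    """
    idx = DEPTHS.index(depth)
    if any(upgrade_signals):
        idx = min(idx + 1, len(DEPTHS) - 1)
    elif downgrade_checks and all(downgrade_checks):
        idx = max(idx - 1, 0)
    return DEPTHS[idx]


# Example: a medium feature that touches authentication code.
score, depth = base_depth(
    {"files": 2, "type": 2, "judgment": 2, "cross_component": 2, "novelty": 3}
)
final_depth = apply_overrides(depth, upgrade_signals=[True], downgrade_checks=[])
budget = DEFAULTS[final_depth]["token_budget"]
```

Here the raw total of 11 lands in the standard band, the security signal lifts it one step to thorough, and the budget is read from the defaults for that depth.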
## How agents score

The `ck:complexity` subagent (haiku) receives a task description and returns a JSON blob:

```json
{
  "score": 11,
  "depth": "standard",
  "axes": {
    "files": 2,
    "type": 2,
    "judgment": 2,
    "cross_component": 2,
    "novelty": 3
  },
  "overrides_applied": []
}
```

`/ck:map` calls this agent per task to set `depth` in the task registry. If the agent produces a score in the "thorough" band with a novelty of 4 and a security override, it may return `needs_research: true`, which `/ck:map` must translate into an upstream `ck:researcher` task dependency before the work itself.

## Integration points

- `/ck:sketch`: runs complexity scoring on the whole domain to set the kit's `complexity:` frontmatter.
- `/ck:map`: runs it per task to assign `depth`.
- `/ck:make`: reads `depth` to size the task budget and pick the review intensity.
- `ck:complexity` agent: pure-haiku worker; does nothing else.

## Anti-patterns

- Using one depth for every task in a kit "for consistency." Cost up, signal down.
- Padding depth to "be safe": if the budget is oversized, the model wastes tokens exploring. Right-size, then raise only when verification fails.
- Ignoring overrides: scoring a login flow as "quick" because it touches one file. Security overrides exist for exact