
Task Quality KPI
Read auto-generated TASK-XXX--kpi.json scores to decide whether a task implementation needs another iteration or is ready to approve.
Overview
Task Quality KPI is an agent skill, used mainly in the Ship phase (and also in Build), that interprets hook-generated TASK-XXX--kpi.json metrics to decide whether to iterate or approve.
Install
npx skills add https://github.com/giuseppe-trisciuoglio/developer-kit --skill task-quality-kpi
What is this skill?
- KPIs auto-calculated by PostToolUse hook on TASK-*.md saves via task-kpi-analyzer.py
- Agents read TASK-XXX--kpi.json—they do not execute analyzer scripts themselves
- Replaces subjective review_status with quantitative 0–10 quality scores
- Hook-on-save architecture separates measurement from evaluation decisions
- Quantitative 0–10 scores per KPI dimension
- Output file pattern TASK-XXX--kpi.json on TASK-*.md save
Adoption & trust: 532 installs on skills.sh; 271 GitHub stars; 3/3 security scanners passed (skills.sh audits).
What problem does it solve?
Your agent marks tasks done with subjective review_status while you have no consistent, numeric signal for whether the implementation actually meets the bar.
Who is it for?
Indie builders using TASK-*.md files with PostToolUse hooks already wired to task-kpi-analyzer.py.
Skip if: Repos without TASK file conventions, hooks, or KPI JSON output—you cannot evaluate what was never generated.
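One way to check that precondition is to scan for task files that have no KPI output yet. The sketch below is a hypothetical helper, not part of the skill. It assumes each KPI file sits in the same directory as its TASK-*.md file and reuses the file's name stem, which the TASK-XXX--kpi.json pattern suggests but this listing does not confirm.

```python
from pathlib import Path


def tasks_missing_kpi(root: str) -> list[Path]:
    """Return TASK-*.md files under root that have no sibling KPI JSON.

    Assumption: TASK-042.md pairs with TASK-042--kpi.json in the same folder.
    """
    missing = []
    for task_file in sorted(Path(root).rglob("TASK-*.md")):
        # Build the expected KPI file name from the task file's stem.
        kpi_file = task_file.with_name(f"{task_file.stem}--kpi.json")
        if not kpi_file.is_file():
            missing.append(task_file)
    return missing
```

An empty result means every task file already has KPI data to read. A non-empty result points to tasks the hook has not yet processed, or to a hook that is not wired up.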
When should I use this skill?
Reading KPI data for task evaluation, understanding quality metrics, or deciding whether to iterate or approve based on data.
What do I get? / Deliverables
You base iterate-or-approve calls on auto-saved 0–10 KPI scores in TASK-XXX--kpi.json after each TASK-*.md edit, without running analyzer scripts inside the skill.
- Documented evaluation decision (iterate vs approve) grounded in KPI JSON
- Interpretation of TASK-XXX--kpi.json metrics
Journey fit
Spans multiple journey phases - primary shelf plus alternate fits below.
Ship review is the canonical shelf because the skill gates merge-ready quality after task work is written, not while ideating features. The Review subphase matches iterate-vs-approve decisions driven by quantitative KPIs instead of subjective checkboxes.
Where it fits
After updating TASK-042.md scope, open KPI JSON to see if the agent’s implementation drifted from acceptance criteria.
Before approving a TASK, confirm composite scores cleared your threshold instead of trusting a single done flag.
Re-run evaluation on a reopened TASK after a production bug to see which KPI dimensions regressed.
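The threshold check in these scenarios can be sketched as a weighted average followed by a comparison. The weights below are the category weights listed in the skill's SKILL.md (30/25/25/20). The dictionary keys, the function names, and the 7.0 threshold are hypothetical, and the analyzer's actual formula and JSON field names may differ.

```python
# Category weights as listed in SKILL.md; they sum to 1.0.
# The key names are illustrative, not the KPI file's real field names.
WEIGHTS = {
    "spec_compliance": 0.30,
    "code_quality": 0.25,
    "test_coverage": 0.25,
    "contract_fulfillment": 0.20,
}


def overall_score(category_scores: dict[str, float]) -> float:
    """Combine four 0-10 category scores into one weighted 0-10 score."""
    missing = WEIGHTS.keys() - category_scores.keys()
    if missing:
        raise KeyError(f"missing categories: {sorted(missing)}")
    return sum(WEIGHTS[name] * category_scores[name] for name in WEIGHTS)


def decide(category_scores: dict[str, float], threshold: float = 7.0) -> str:
    """Return "approve" if the weighted score reaches the threshold, else "iterate"."""
    return "approve" if overall_score(category_scores) >= threshold else "iterate"
```

With category scores of 8, 7, 6, and 9 in the order above, the weighted score comes to about 7.45, so the call approves at a 7.0 threshold and asks for another iteration at 8.0.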
How it compares
Use hook-fed KPI JSON instead of ad-hoc agent self-grading in chat with no persisted metrics.
Common Questions / FAQ
Who is task-quality-kpi for?
Solo builders and small teams using structured TASK-*.md workflows who want agents to read objective quality scores before approving work.
When should I use task-quality-kpi?
After saving a TASK file during Build PM tracking; during Ship review before merge; and when re-opening a task to decide if another agent iteration is warranted.
Is task-quality-kpi safe to install?
Check the Security Audits panel on this page; the skill allows Read and Write and depends on a local hook script—verify that script in your repo before enabling automation.
SKILL.md
# Task Quality KPI Framework

## Overview

The **Task Quality KPI Framework** provides **objective, quantitative metrics** for evaluating task implementation quality.

**Key Architecture**: KPIs are **auto-generated by a hook** - you read the results, not run scripts.

```
┌─────────────────────────────────────────────────────────────┐
│  HOOK (auto-executes)                                       │
│  Trigger: PostToolUse on TASK-*.md                          │
│  Script: task-kpi-analyzer.py                               │
│  Output: TASK-XXX--kpi.json                                 │
├─────────────────────────────────────────────────────────────┤
│  SKILL / AGENT (reads output)                               │
│  Input: TASK-XXX--kpi.json                                  │
│  Action: Make evaluation decisions                          │
└─────────────────────────────────────────────────────────────┘
```

### Why This Architecture?

| Problem | Solution |
|---------|----------|
| Skills can't execute scripts | Hook auto-runs on file save |
| Subjective review_status | Quantitative 0-10 scores |
| "Looks good to me" | Evidence-based evaluation |
| Binary pass/fail | Graduated quality levels |

## KPI File Location

After any task file modification, find KPI data at:

```
docs/specs/[ID]/tasks/TASK-XXX--kpi.json
```

## KPI Categories

```
┌─────────────────────────────────────────────────────────────┐
│                    OVERALL SCORE (0-10)                     │
├─────────────────────────────────────────────────────────────┤
│  Spec Compliance (30%)                                      │
│  ├── Acceptance Criteria Met (0-10)                         │
│  ├── Requirements Coverage (0-10)                           │
│  └── No Scope Creep (0-10)                                  │
├─────────────────────────────────────────────────────────────┤
│  Code Quality (25%)                                         │
│  ├── Static Analysis (0-10)                                 │
│  ├── Complexity (0-10)                                      │
│  └── Patterns Alignment (0-10)                              │
├─────────────────────────────────────────────────────────────┤
│  Test Coverage (25%)                                        │
│  ├── Unit Tests Present (0-10)                              │
│  ├── Test/Code Ratio (0-10)                                 │
│  └── Coverage Percentage (0-10)                             │
├─────────────────────────────────────────────────────────────┤
│  Contract Fulfillment (20%)                                 │
│  ├── Provides Verified (0-10)                               │
│  └── Expects Satisfied (0-10)                               │
└─────────────────────────────────────────────────────────────┘
```

### Category Weights

| Category | Weight | Why |
|----------|--------|-----|
| Spec Compliance | 30% | Most important - did we build what was asked? |
| Code Quality | 25% | Technical excellence |
| Test Coverage | 25% | Verification and confidence |
| Contract Fulfillment | 20% | Integration with other tasks |

## When to Use

- Reading KPI data for task quality evaluation
- Understanding quality metrics and scoring breakdown
- Deciding whether to iterate or approve based on quantitative data
- Integrating KPI checks into automated loops (`agents_loop.py`)
- Generating evidence-based evaluation reports

## Instructions

### 1. Reading KPI Data (Primary Use)

**DO NOT run scripts** - read the auto-generated file:

```markdown
Read the KPI file: docs/specs/001-feature/tasks/TASK-001--kpi.json
```

### 2. Understanding the Data

Th