Task Quality Kpi

Author: giuseppe-trisciuoglio
Repository: giuseppe-trisciuoglio/developer-kit

Read auto-generated TASK-XXX--kpi.json scores to decide whether a task implementation needs another iteration or is ready to approve.

Overview

Task Quality KPI is an agent skill, most often used in the Ship phase (and also in Build), that interprets the hook-generated TASK-XXX--kpi.json metrics to decide whether a task should be iterated on or approved.

Install

npx skills add https://github.com/giuseppe-trisciuoglio/developer-kit --skill task-quality-kpi

What is this skill?

KPIs are calculated automatically by a PostToolUse hook that runs task-kpi-analyzer.py whenever a TASK-*.md file is saved
Agents read TASK-XXX--kpi.json; they do not execute the analyzer script themselves
Quantitative 0–10 scores per KPI dimension replace the subjective review_status
The hook-on-save architecture separates measurement from evaluation decisions
Output file pattern: TASK-XXX--kpi.json, written on each TASK-*.md save

Compatible agents: Claude Code, Cursor, Codex, any compatible agent

Adoption & trust: 532 installs on skills.sh; 271 GitHub stars; 3/3 security scanners passed (skills.sh audits).

What problem does it solve?

Your agent marks tasks done with subjective review_status while you have no consistent, numeric signal for whether the implementation actually meets the bar.

Who is it for?

Indie builders using TASK-*.md files with PostToolUse hooks already wired to task-kpi-analyzer.py.

Skip if: Repos without TASK file conventions, hooks, or KPI JSON output—you cannot evaluate what was never generated.

When should I use this skill?

Reading KPI data for task evaluation, understanding quality metrics, or deciding whether to iterate or approve based on data.

What do I get? / Deliverables

You base iterate-or-approve calls on auto-saved 0–10 KPI scores in TASK-XXX--kpi.json after each TASK-*.md edit, without running analyzer scripts inside the skill.

Documented evaluation decision (iterate vs approve) grounded in KPI JSON
Interpretation of TASK-XXX--kpi.json metrics

Journey fit

Spans multiple journey phases - primary shelf plus alternate fits below.

Primary fit

Ship review is the canonical shelf because the skill gates merge-ready quality after task work is written, not while ideating features. Review subphase matches iterate-vs-approve decisions driven by quantitative KPIs instead of subjective checkboxes.

Also useful

Build: Project management & tracking

Where it fits

Build (Project management & tracking): After updating TASK-042.md scope, open KPI JSON to see if the agent’s implementation drifted from acceptance criteria.

Ship (Code review): Before approving a TASK, confirm composite scores cleared your threshold instead of trusting a single done flag.

Operate (Iteration & experiments): Re-run evaluation on a reopened TASK after a production bug to see which KPI dimensions regressed.

How it compares

Use hook-fed KPI JSON instead of ad-hoc agent self-grading in chat with no persisted metrics.

Common Questions / FAQ

Who is task-quality-kpi for?

Solo builders and small teams using structured TASK-*.md workflows who want agents to read objective quality scores before approving work.

When should I use task-quality-kpi?

After saving a TASK file during Build PM tracking; during Ship review before merge; and when re-opening a task to decide if another agent iteration is warranted.

Is task-quality-kpi safe to install?

Check the Security Audits panel on this page; the skill allows Read and Write and depends on a local hook script—verify that script in your repo before enabling automation.

SKILL.md

# Task Quality KPI Framework

## Overview

The **Task Quality KPI Framework** provides **objective, quantitative metrics** for evaluating task implementation quality. 

**Key Architecture**: KPIs are **auto-generated by a hook**. You read the results; you do not run the scripts yourself.

```
┌─────────────────────────────────────────────────────────────┐
│  HOOK (auto-executes)                                       │
│  Trigger: PostToolUse on TASK-*.md                          │
│  Script: task-kpi-analyzer.py                               │
│  Output: TASK-XXX--kpi.json                                 │
├─────────────────────────────────────────────────────────────┤
│  SKILL / AGENT (reads output)                               │
│  Input: TASK-XXX--kpi.json                                  │
│  Action: Make evaluation decisions                          │
└─────────────────────────────────────────────────────────────┘
```

### Why This Architecture?

| Problem | Solution |
|---------|----------|
| Skills can't execute scripts | Hook auto-runs on file save |
| Subjective review_status | Quantitative 0-10 scores |
| "Looks good to me" | Evidence-based evaluation |
| Binary pass/fail | Graduated quality levels |

## KPI File Location

After any task file modification, find KPI data at:

```
docs/specs/[ID]/tasks/TASK-XXX--kpi.json
```
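
As a rough illustration of the naming pattern above, the sketch below maps a task file to the KPI file expected alongside it. The helper name `kpi_path_for` is hypothetical, and the code assumes the KPI file is written into the same directory as the task file, which the path pattern suggests but does not state outright.

```python
from pathlib import Path


def kpi_path_for(task_file: str) -> Path:
    """Map a task file such as TASK-042.md to a sibling TASK-042--kpi.json.

    Assumption: the hook writes the KPI file next to the task file.
    """
    task = Path(task_file)
    # Keep the directory, swap "TASK-042.md" for "TASK-042--kpi.json".
    return task.with_name(f"{task.stem}--kpi.json")


if __name__ == "__main__":
    print(kpi_path_for("docs/specs/001-feature/tasks/TASK-001.md").as_posix())
```

The helper only computes a path; it does not create or refresh the KPI file, which remains the hook's job.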

## KPI Categories

```
┌─────────────────────────────────────────────────────────────┐
│                    OVERALL SCORE (0-10)                     │
├─────────────────────────────────────────────────────────────┤
│  Spec Compliance (30%)                                      │
│  ├── Acceptance Criteria Met (0-10)                         │
│  ├── Requirements Coverage (0-10)                           │
│  └── No Scope Creep (0-10)                                  │
├─────────────────────────────────────────────────────────────┤
│  Code Quality (25%)                                         │
│  ├── Static Analysis (0-10)                                 │
│  ├── Complexity (0-10)                                      │
│  └── Patterns Alignment (0-10)                              │
├─────────────────────────────────────────────────────────────┤
│  Test Coverage (25%)                                        │
│  ├── Unit Tests Present (0-10)                              │
│  ├── Test/Code Ratio (0-10)                                 │
│  └── Coverage Percentage (0-10)                             │
├─────────────────────────────────────────────────────────────┤
│  Contract Fulfillment (20%)                                 │
│  ├── Provides Verified (0-10)                               │
│  └── Expects Satisfied (0-10)                               │
└─────────────────────────────────────────────────────────────┘
```

### Category Weights

| Category | Weight | Why |
|----------|--------|-----|
| Spec Compliance | 30% | Most important - did we build what was asked? |
| Code Quality | 25% | Technical excellence |
| Test Coverage | 25% | Verification and confidence |
| Contract Fulfillment | 20% | Integration with other tasks |
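
To make the weighting concrete, here is a minimal sketch that combines four category scores into one overall score using the weights from the table. It assumes the overall score is a plain weighted sum of the category scores; the dictionary keys and the function name `overall_score` are illustrative and are not taken from the analyzer's actual output.

```python
from typing import Mapping

# Weights from the Category Weights table (they sum to 1.0).
# The key names are hypothetical labels, not the analyzer's field names.
WEIGHTS = {
    "spec_compliance": 0.30,
    "code_quality": 0.25,
    "test_coverage": 0.25,
    "contract_fulfillment": 0.20,
}


def overall_score(category_scores: Mapping[str, float]) -> float:
    """Return the weighted sum of 0-10 category scores, rounded to 2 places."""
    missing = WEIGHTS.keys() - category_scores.keys()
    if missing:
        raise ValueError(f"missing category scores: {sorted(missing)}")
    total = sum(WEIGHTS[name] * category_scores[name] for name in WEIGHTS)
    return round(total, 2)


if __name__ == "__main__":
    example = {
        "spec_compliance": 8.0,
        "code_quality": 7.0,
        "test_coverage": 6.0,
        "contract_fulfillment": 9.0,
    }
    print(overall_score(example))
```

For the example scores, the arithmetic is 0.30 × 8 + 0.25 × 7 + 0.25 × 6 + 0.20 × 9 = 7.45, so a weak Test Coverage score pulls the total down even when Contract Fulfillment is strong.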

## When to Use

- Reading KPI data for task quality evaluation
- Understanding quality metrics and scoring breakdown
- Deciding whether to iterate or approve based on quantitative data
- Integrating KPI checks into automated loops (`agents_loop.py`)
- Generating evidence-based evaluation reports
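
The iterate-or-approve decision listed above could be sketched roughly as follows. The threshold of 7.0, the `Decision` type, and the `decide` function are all assumptions made for illustration; the source does not specify a threshold or a decision API, so choose a bar that fits your project.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass
class Decision:
    action: str   # "approve" or "iterate"
    weakest: str  # category with the lowest score, useful for the next pass


def decide(
    category_scores: Mapping[str, float],
    overall: float,
    threshold: float = 7.0,  # hypothetical default, not defined by the skill
) -> Decision:
    """Approve when the overall score clears the threshold, else iterate."""
    weakest = min(category_scores, key=category_scores.get)
    action = "approve" if overall >= threshold else "iterate"
    return Decision(action=action, weakest=weakest)


if __name__ == "__main__":
    scores = {"spec_compliance": 8.0, "test_coverage": 5.0}
    print(decide(scores, overall=6.4))
```

Returning the weakest category alongside the action gives an automated loop something specific to target in the next iteration, rather than a bare pass or fail.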

## Instructions

### 1. Reading KPI Data (Primary Use)

**DO NOT run scripts** - read the auto-generated file:

```markdown
Read the KPI file:
  docs/specs/001-feature/tasks/TASK-001--kpi.json
```
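
For an automated loop that consumes the same file, a minimal reader might look like the sketch below. It only reads the JSON the hook has already written and never invokes the analyzer. The function name `load_kpi` and the `overall_score` key are hypothetical; check the real field names in your generated file before relying on them.

```python
import json
from pathlib import Path
from typing import Optional


def load_kpi(path: str) -> Optional[dict]:
    """Return the parsed KPI JSON, or None if the hook has not written it yet."""
    kpi_file = Path(path)
    if not kpi_file.is_file():
        return None
    with kpi_file.open(encoding="utf-8") as handle:
        return json.load(handle)


if __name__ == "__main__":
    kpi = load_kpi("docs/specs/001-feature/tasks/TASK-001--kpi.json")
    if kpi is None:
        print("No KPI file found; save the task file so the hook can generate it.")
    else:
        # "overall_score" is an assumed key name, used here for illustration only.
        print(kpi.get("overall_score"))
```

Treating a missing file as `None` rather than an error keeps the caller in control: the absence of KPI data means the hook has not run, not that the task failed.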

### 2. Understanding the Data

Th
