Goals

Name: Goals
Author: boshu2

boshu2/agentops

Measure directive satisfaction from linked behavioral scenarios and executable-spec gates with ao goals.

Overview

Goals is an agent skill, most often used in the Ship phase (and also in Build project management and Operate iteration), that defines how ao goals measure turns the PASS results of linked scenarios into directive satisfaction ratios.

Install

npx skills add https://github.com/boshu2/agentops --skill goals

What is this skill?

scenario_satisfaction JSON with linked, satisfied, ratio, threshold, and GREEN/YELLOW/RED status
ao goals measure --scenarios-only skips shell gates for fast executable-spec iteration
Aggregates latest PASS/FAIL per scenario linked to each directive
Documented default satisfaction threshold of 0.8 in policy examples
Chains to ao scenario family artifacts per executable-spec reference doc
scenario_satisfaction includes five fields: linked, satisfied, ratio, threshold, status
Status enum: GREEN (ratio >= threshold), YELLOW (linked == 0), RED (ratio below threshold)

Compatible agents: Claude Code, Codex, any compatible agent

Adoption & trust: 785 installs on skills.sh; 384 GitHub stars; 3/3 security scanners passed (skills.sh audits).

What problem does it solve?

You have directives and behavioral scenarios but no single ratio that says whether executable specs are satisfied enough to steer or ship.

Who is it for?

AgentOps users wiring behavioral scenarios to directives and gating releases on satisfaction thresholds.

Skip if: Teams without ao goals, directives, or ao scenario artifacts who only want generic unit-test runners.

When should I use this skill?

Use it when you work with ao goals measure, directive-to-scenario links, --scenarios-only, or scenario_satisfaction JSON from the AgentOps executable-spec chain.

What do I get? / Deliverables

You get ao goals measure output with per-directive scenario_satisfaction and status colors so you can re-steer scenarios or run full gates next.

scenario_satisfaction aggregates per directive
CI-friendly JSON from ao goals measure
Clear GREEN/YELLOW/RED steering signal

Journey fit

Spans multiple journey phases - primary shelf plus alternate fits below.

Primary fit

Ship/testing is the canonical shelf because scenario PASS ratios and gate commands answer "are we safe to release" before full promote. Testing subphase matches behavioral scenarios, satisfaction thresholds, and optional scenarios-only CI runs without full shell gates.

Also useful

Build: Project management & tracking

Operate: Iteration & experiments

Where it fits

Example use

Ship: Testing & QA

Pre-release: run ao goals measure --json to confirm ratio >= threshold on all release directives.

Example use

Build: Project management & tracking

While drafting a directive, link scenarios and use --scenarios-only until the satisfied count stabilizes.

Example use

Operate: Iteration & experiments

After a RED ratio, auto re-steer (F5) or fix scenarios before the next promote.

How it compares

Executable-spec satisfaction for agent directives—not a replacement for Jest, Playwright, or human code review skills.

Common Questions / FAQ

Who is goals for?

Solo builders and small teams using AgentOps ao goals who need documented scenario_satisfaction semantics for measure and CI.

When should I use goals?

At Ship testing before release; at Build PM when linking scenarios to directives; at Operate iterate when re-steering after RED status—invoke ao goals measure per this contract.

Is goals safe to install?

Reference skill for measurement contracts—review SKILL.md and sibling gate scripts; use the Security Audits panel on this Prism page before automating measure in CI.

SKILL.md


# Executable-Spec Chain — Reference

Detailed contracts for the executable-spec layer of `ao goals`: scenario
satisfaction (F2), trace/render (F4), and auto re-steer (F5). The `goals`
SKILL.md links here; this file holds the precise schemas and exit-code rules.

## Scenario satisfaction (F2)

`ao goals measure` aggregates the latest result of every behavioral scenario
linked to a directive and computes a satisfaction ratio. The producer reads
scenario result artifacts written by the `ao scenario` family.

### `scenario_satisfaction` JSON shape

Every directive object in `ao goals measure --json` and
`ao goals measure --directives` carries:

```jsonc
"scenario_satisfaction": {
  "linked": 4,         // count of scenarios linked to the directive
  "satisfied": 3,      // count whose latest result artifact is PASS
  "ratio": 0.75,       // satisfied / linked (0.0 when linked == 0)
  "threshold": 0.8,    // directive's required ratio (default in policy)
  "status": "RED"      // GREEN (ratio >= threshold)
                       // YELLOW (linked == 0 — nothing to satisfy yet)
                       // RED (ratio < threshold)
}
```
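The field comments above fully determine how `ratio` and `status` follow from `linked`, `satisfied`, and `threshold`. The following is a minimal Python sketch of that derivation, not the `ao` implementation; the function name `scenario_satisfaction` and the input validation are illustrative assumptions.

```python
def scenario_satisfaction(linked: int, satisfied: int, threshold: float = 0.8) -> dict:
    """Derive ratio and status from counts (hypothetical helper, not part of ao)."""
    # ASSUMPTION: invalid counts are rejected; the reference does not say how ao handles them.
    if linked < 0 or satisfied < 0 or satisfied > linked:
        raise ValueError("expected 0 <= satisfied <= linked")

    # ratio is satisfied / linked, and 0.0 when nothing is linked.
    ratio = satisfied / linked if linked else 0.0

    # YELLOW is checked first: with linked == 0 the ratio is 0.0, which would
    # otherwise fall below any positive threshold and read as RED.
    if linked == 0:
        status = "YELLOW"
    elif ratio >= threshold:
        status = "GREEN"
    else:
        status = "RED"

    return {
        "linked": linked,
        "satisfied": satisfied,
        "ratio": ratio,
        "threshold": threshold,
        "status": status,
    }
```

Calling `scenario_satisfaction(4, 3)` reproduces the example object above: a ratio of 0.75 against a threshold of 0.8, so the status is `RED`.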

### `--scenarios-only`

`ao goals measure --scenarios-only` evaluates ONLY the executable-spec layer and
skips shell gate-command execution. Use it for fast iteration on scenarios
without paying for the full gate suite. Combine with `-o json` for CI.

### Result-artifact resolution order

Scenario results are resolved from result artifacts (ADR-0003 durability
contract):

1. Promoted spec scenarios — tracked `spec/scenarios/`.
2. Ad hoc holdout scenarios — `.agents/holdout/<id>.json`.
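One way to read the resolution order is "first existing location wins". The sketch below illustrates that reading only. The helper name `resolve_result_artifact` is hypothetical, the `<id>.json` file naming under `spec/scenarios/` is an assumption (the reference gives only the directory), and first-match precedence is itself an interpretation; ADR-0003 is the authority.

```python
from pathlib import Path
from typing import Optional


def resolve_result_artifact(repo_root: Path, scenario_id: str) -> Optional[Path]:
    """Return the first existing result artifact for a scenario, or None."""
    candidates = [
        # 1. Promoted spec scenarios.
        # ASSUMPTION: artifacts are named <id>.json here; the reference names only the directory.
        repo_root / "spec" / "scenarios" / f"{scenario_id}.json",
        # 2. Ad hoc holdout scenarios (path pattern taken from the list above).
        repo_root / ".agents" / "holdout" / f"{scenario_id}.json",
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    # No artifact found; the caller decides how to report the missing result.
    return None
```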

### Exit-code semantics

| Exit | Meaning |
|------|---------|
| 0 | All gates and all directive scenario thresholds satisfied. |
| 1 | One or more gates failed, or a directive is `RED` (ratio below threshold). |
| 2 | Partial result — a scenario artifact was missing or unreadable. |
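A CI wrapper might translate these exit codes into a log message and a release decision. The sketch below is illustrative: `describe_exit` and `blocks_release` are hypothetical helper names, and treating exit code 2 (partial result) as release-blocking is a conservative assumption rather than something the table states.

```python
# Meanings copied from the exit-code table above.
EXIT_MEANINGS = {
    0: "all gates and all directive scenario thresholds satisfied",
    1: "one or more gates failed, or a directive is RED",
    2: "partial result: a scenario artifact was missing or unreadable",
}


def describe_exit(code: int) -> str:
    """Return a human-readable meaning for an `ao goals measure` exit code."""
    return EXIT_MEANINGS.get(code, f"unrecognized exit code {code}")


def blocks_release(code: int) -> bool:
    """Decide whether a pipeline should stop.

    ASSUMPTION: anything other than 0 blocks, including partial results (2).
    """
    return code != 0
```

A pipeline could pass the return code of its `ao goals measure` step to `blocks_release` and log `describe_exit` alongside it.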

## Trace and render (F4)

### `ao goals trace`

Renders and audits the directive → scenario → bead → verdict → learning chain.

- `--from <id>` — render the lineage tree rooted at a directive (`d-...`),
  scenario (`s-...`), or bead ID. Add `-o json` for a line-delimited JSON graph.
- `--orphans` — audit the whole chain. Broken references are **errors**;
  missing downstream yields (e.g. a scenario with no verdict) are **warnings**.
- `--strict` — escalate warning-class defects to a non-zero exit (ADR-0005
  §4.2). Errors always exit non-zero regardless of `--strict`.
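The interaction between errors, warnings, and `--strict` can be summarized as a small decision function. This is a sketch of the rule as described above, not the CLI's code; the name `trace_exit_code` is hypothetical and the concrete non-zero value `1` is an assumption, since the text only says "non-zero".

```python
def trace_exit_code(errors: int, warnings: int, strict: bool = False) -> int:
    """Model the audit exit rule: errors always fail, warnings fail only under strict."""
    # Broken references (errors) exit non-zero regardless of --strict.
    if errors > 0:
        return 1  # ASSUMPTION: the exact non-zero value is not specified
    # Warning-class defects are escalated only when --strict is set.
    if strict and warnings > 0:
        return 1
    return 0
```

Under this model, an audit with two warnings and no errors exits 0 by default and non-zero with `--strict`.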

Link anchors are stable directive IDs (`^d-[a-z0-9][a-z0-9-]*$`) — never the
display numbers, which are not stable across edits. The full link grammar and
defect taxonomy are in `docs/adr/ADR-0005`.
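To check a link anchor before committing it, the pattern above can be applied directly. The following sketch uses Python's standard `re` module; the helper name `is_stable_directive_id` and the sample ID `d-auth-01` are illustrative, not taken from the project.

```python
import re

# Pattern copied verbatim from the reference text.
DIRECTIVE_ID = re.compile(r"^d-[a-z0-9][a-z0-9-]*$")


def is_stable_directive_id(value: str) -> bool:
    """Return True when value looks like a stable directive ID, e.g. 'd-auth-01'."""
    # fullmatch avoids the quirk where '$' also matches just before a trailing newline.
    return DIRECTIVE_ID.fullmatch(value) is not None
```

A display number such as `3` fails the check, as does `d--x`, because the character after `d-` must be a lowercase letter or digit.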

### `ao goals render`

Exports directive-linked scenarios as a Gherkin `.feature` file:

- bare — print Gherkin to stdout.
- `--out <path>` — write the Gherkin to a file instead.

## Auto re-steer (F5)

When a directive's scenarios fail chronically, the re-steer engine recommends a
directive mutation. This is the last and most safety-gated part of the chain.

### `ao goals steer recommend`

Read-only. Runs the re-steer policy engine over the verdict ledger and prints
recommended directive mutations plus skip reasons. GOALS.md is never modified.

### `ao goals steer apply`

Applies the top recommendation to GOALS.md. Two conditions must BOTH hold:

1. The policy's `auto_apply` is `true`.
2. The operator confirms — interactive prompt, or `--auto` / `--yes` for
   non-interactive scripted consent.

A run without confirmation never changes GOALS.md. Every mutation routes through
the non-lossy directive-block patcher (`cli/internal/goals/patcher.go`) — never
`RenderGoalsMD` / `WriteMDGoals`, which are lossy full re-renders.

- `--policy <path>` — re-steer policy file (default `docs/re-steer-policy.json`).
- `--auto` / `--yes` — pre-confirm for non-interactive use.

Policy schema, verdict-ledger format, mutation-safety invariants, and the
human-gate contract are in `docs/adr/ADR-0006`.

