
Goals
Measure directive satisfaction from linked behavioral scenarios and executable-spec gates with ao goals.
Overview
Goals is an agent skill, most often used in the Ship phase (also Build PM and Operate iterate), that defines how the ao goals measure command turns linked scenario PASS results into directive satisfaction ratios.
Install
npx skills add https://github.com/boshu2/agentops --skill goals
What is this skill?
- scenario_satisfaction JSON with linked, satisfied, ratio, threshold, and GREEN/YELLOW/RED status
- ao goals measure --scenarios-only skips shell gates for fast executable-spec iteration
- Aggregates latest PASS/FAIL per scenario linked to each directive
- Documented default satisfaction threshold of 0.8 in policy examples
- Chains to ao scenario family artifacts per executable-spec reference doc
- scenario_satisfaction includes five fields: linked, satisfied, ratio, threshold, status
- Status enum: GREEN (ratio >= threshold), YELLOW (linked == 0), RED (ratio below threshold)
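The aggregation described in the bullets above can be sketched in a few lines of Python. This is an illustrative reimplementation of the documented rules, not the code that ao goals measure runs; the function name, the dataclass, and the input shape (a mapping from scenario ID to its latest result string) are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class ScenarioSatisfaction:
    linked: int       # scenarios linked to the directive
    satisfied: int    # scenarios whose latest result is PASS
    ratio: float      # satisfied / linked, 0.0 when nothing is linked
    threshold: float  # required ratio (0.8 in the policy examples)
    status: str       # GREEN, YELLOW, or RED


def scenario_satisfaction(latest_results, threshold=0.8):
    """Compute the five documented fields.

    latest_results is a hypothetical input shape: a dict mapping each
    linked scenario ID to its latest result, "PASS" or "FAIL".
    """
    linked = len(latest_results)
    satisfied = sum(1 for result in latest_results.values() if result == "PASS")
    ratio = satisfied / linked if linked else 0.0
    if linked == 0:
        status = "YELLOW"  # nothing to satisfy yet
    elif ratio >= threshold:
        status = "GREEN"
    else:
        status = "RED"
    return ScenarioSatisfaction(linked, satisfied, ratio, threshold, status)


example = scenario_satisfaction(
    {"s-login": "PASS", "s-logout": "PASS", "s-reset": "PASS", "s-lockout": "FAIL"}
)
print(example)
```

The YELLOW check comes first because the ratio is defined as 0.0 when no scenarios are linked, which would otherwise read as RED.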
Adoption & trust: 785 installs on skills.sh; 384 GitHub stars; 3/3 security scanners passed (skills.sh audits).
What problem does it solve?
You have directives and behavioral scenarios but no single ratio that says whether executable specs are satisfied enough to steer or ship.
Who is it for?
AgentOps users wiring behavioral scenarios to directives and gating releases on satisfaction thresholds.
Skip if: Teams without ao goals, directives, or ao scenario artifacts who only want generic unit-test runners.
When should I use this skill?
Use it when you work with ao goals measure, directive scenario links, --scenarios-only, or scenario_satisfaction JSON from the AgentOps executable-spec chain.
What do I get? / Deliverables
You get ao goals measure output with per-directive scenario_satisfaction and status colors so you can re-steer scenarios or run full gates next.
- scenario_satisfaction aggregates per directive
- CI-friendly JSON from ao goals measure
- Clear GREEN/YELLOW/RED steering signal
Journey fit
Spans multiple journey phases - primary shelf plus alternate fits below.
Ship/testing is the canonical shelf because scenario PASS ratios and gate commands answer "are we safe to release" before a full promote. The Testing subphase matches behavioral scenarios, satisfaction thresholds, and optional scenarios-only CI runs without full shell gates.
Where it fits
Pre-release: ao goals measure --json to confirm ratio >= threshold on all release directives.
While drafting a directive, link scenarios and use --scenarios-only until satisfied count stabilizes.
After a RED ratio, auto re-steer (F5) or fix scenarios before the next promote.
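A pre-release check along these lines could be sketched as follows. The exit-code meanings are taken from the reference in SKILL.md; everything else is an assumption for illustration: the helper name triage, the id field on each directive object, and the idea that the directive objects have already been pulled out of the JSON output (the top-level layout of that output is not documented here).

```python
# Exit codes as documented in the executable-spec reference.
EXIT_MEANINGS = {
    0: "all gates and all directive scenario thresholds satisfied",
    1: "one or more gates failed, or a directive is RED",
    2: "partial result: a scenario artifact was missing or unreadable",
}


def triage(directives):
    """Split directive IDs into RED (blocking) and YELLOW (no linked scenarios).

    `directives` is assumed to be a list of directive objects, each with a
    hypothetical "id" field and the documented "scenario_satisfaction" object.
    """
    red, unlinked = [], []
    for directive in directives:
        status = directive["scenario_satisfaction"]["status"]
        if status == "RED":
            red.append(directive["id"])
        elif status == "YELLOW":
            unlinked.append(directive["id"])
    return red, unlinked


sample = [
    {"id": "d-checkout", "scenario_satisfaction": {"linked": 4, "satisfied": 3, "ratio": 0.75, "threshold": 0.8, "status": "RED"}},
    {"id": "d-search", "scenario_satisfaction": {"linked": 5, "satisfied": 5, "ratio": 1.0, "threshold": 0.8, "status": "GREEN"}},
    {"id": "d-onboarding", "scenario_satisfaction": {"linked": 0, "satisfied": 0, "ratio": 0.0, "threshold": 0.8, "status": "YELLOW"}},
]
red, unlinked = triage(sample)
print("blocking:", red)
print("no scenarios linked:", unlinked)
```

Keeping YELLOW separate from RED reflects the documented distinction: a YELLOW directive has nothing to satisfy yet, while a RED one has linked scenarios that fall below its threshold.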
How it compares
Executable-spec satisfaction for agent directives—not a replacement for Jest, Playwright, or human code review skills.
Common Questions / FAQ
Who is goals for?
Solo builders and small teams using AgentOps ao goals who need documented scenario_satisfaction semantics for measure and CI.
When should I use goals?
At Ship testing before release; at Build PM when linking scenarios to directives; at Operate iterate when re-steering after RED status—invoke ao goals measure per this contract.
Is goals safe to install?
Reference skill for measurement contracts—review SKILL.md and sibling gate scripts; use the Security Audits panel on this Prism page before automating measure in CI.
SKILL.md
# Executable-Spec Chain: Reference

Detailed contracts for the executable-spec layer of `ao goals`: scenario satisfaction (F2), trace/render (F4), and auto re-steer (F5). The `goals` SKILL.md links here; this file holds the precise schemas and exit-code rules.

## Scenario satisfaction (F2)

`ao goals measure` aggregates the latest result of every behavioral scenario linked to a directive and computes a satisfaction ratio. The producer reads scenario result artifacts written by the `ao scenario` family.

### `scenario_satisfaction` JSON shape

Every directive object in `ao goals measure --json` and `ao goals measure --directives` carries:

```jsonc
"scenario_satisfaction": {
  "linked": 4,        // count of scenarios linked to the directive
  "satisfied": 3,     // count whose latest result artifact is PASS
  "ratio": 0.75,      // satisfied / linked (0.0 when linked == 0)
  "threshold": 0.8,   // directive's required ratio (default in policy)
  "status": "RED"     // GREEN  (ratio >= threshold)
                      // YELLOW (linked == 0, nothing to satisfy yet)
                      // RED    (ratio < threshold)
}
```

### `--scenarios-only`

`ao goals measure --scenarios-only` evaluates ONLY the executable-spec layer and skips shell gate-command execution. Use it for fast iteration on scenarios without paying for the full gate suite. Combine with `-o json` for CI.

### Result-artifact resolution order

Scenario results are resolved from result artifacts (ADR-0003 durability contract):

1. Promoted spec scenarios: tracked `spec/scenarios/`.
2. Ad hoc holdout scenarios: `.agents/holdout/<id>.json`.

### Exit-code semantics

| Exit | Meaning |
|------|---------|
| 0 | All gates and all directive scenario thresholds satisfied. |
| 1 | One or more gates failed, or a directive is `RED` (ratio below threshold). |
| 2 | Partial result: a scenario artifact was missing or unreadable. |

## Trace and render (F4)

### `ao goals trace`

Renders and audits the directive → scenario → bead → verdict → learning chain.
- `--from <id>`: render the lineage tree rooted at a directive (`d-...`), scenario (`s-...`), or bead ID. Add `-o json` for a line-delimited JSON graph.
- `--orphans`: audit the whole chain. Broken references are **errors**; missing downstream yields (e.g. a scenario with no verdict) are **warnings**.
- `--strict`: escalate warning-class defects to a non-zero exit (ADR-0005 §4.2). Errors always exit non-zero regardless of `--strict`.

Link anchors are stable directive IDs (`^d-[a-z0-9][a-z0-9-]*$`), never the display numbers, which are not stable across edits. The full link grammar and defect taxonomy are in `docs/adr/ADR-0005`.

### `ao goals render`

Exports directive-linked scenarios as a Gherkin `.feature` file:

- bare: print Gherkin to stdout.
- `--out <path>`: write the Gherkin to a file instead.

## Auto re-steer (F5)

When a directive's scenarios fail chronically, the re-steer engine recommends a directive mutation. This is the last and most safety-gated part of the chain.

### `ao goals steer recommend`

Read-only. Runs the re-steer policy engine over the verdict ledger and prints recommended directive mutations plus skip reasons. GOALS.md is never modified.

### `ao goals steer apply`

Applies the top recommendation to GOALS.md. Two conditions must BOTH hold:

1. The policy's `auto_apply` is `true`.
2. The operator confirms: interactive prompt, or `--auto` / `--yes` for non-interactive scripted consent.

A run without confirmation never changes GOALS.md. Every mutation routes through the non-lossy directive-block patcher (`cli/internal/goals/patcher.go`), never `RenderGoalsMD` / `WriteMDGoals`, which are lossy full re-renders.

- `--policy <path>`: re-steer policy file (default `docs/re-steer-policy.json`).
- `--auto` / `--yes`: pre-confirm for non-interactive use.

Policy schema, verdict-ledger format, mutation-safety invariants, and the human-gate contract are in `docs/adr/ADR-0006`.
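
The two-condition gate for `ao goals steer apply` can be illustrated with a small Python sketch. It only models the documented rule (policy `auto_apply` is `true` AND the operator confirms); the function name `may_apply` is hypothetical, and the policy file is treated as a JSON object with a top-level `auto_apply` key, which is an assumption since the full schema lives in `docs/adr/ADR-0006`.

```python
import json


def may_apply(policy_text: str, operator_confirmed: bool) -> bool:
    """Return True only when both documented conditions hold.

    policy_text: contents of a re-steer policy file, assumed here to be
    a JSON object with a top-level "auto_apply" boolean.
    operator_confirmed: True for an interactive yes, or when the run was
    pre-confirmed with --auto / --yes.
    """
    policy = json.loads(policy_text)
    # Require the literal JSON value true; a missing key or a truthy
    # string such as "yes" does not count.
    auto_apply = policy.get("auto_apply") is True
    return auto_apply and bool(operator_confirmed)


print(may_apply('{"auto_apply": true}', operator_confirmed=True))
print(may_apply('{"auto_apply": true}', operator_confirmed=False))
print(may_apply('{"auto_apply": false}', operator_confirmed=True))
```

Only the first call reports that a mutation may proceed; either condition failing on its own leaves GOALS.md untouched.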