Evolve

Operate/Iterate is the canonical shelf for evolve because it is a compounding production-improvement loop, not a one-shot feature generator. The Iterate phase fits its continuous measure-fix-remeasure cycles, which are tied to GOALS.md deltas and to operator cadence after work ships.

Where it fits

Example use

After deploy, run evolve to close the loop on the worst reliability or quality goal in GOALS.md.

Example use

Finish a post-mortem on a release and let evolve select the next fix validated through RPI.

Example use

When the backlog is stale, evolve analyzes the repo and spawns the highest-value RPI work item.

How it compares

Evolve is an AgentOps workflow skill for local compounding loops. It is not a hosted managed-agent grader, and it is not a one-off code review checklist.

Common Questions / FAQ

Who is evolve for?

Developers using boshu2/agentops who want terminal-native autonomous improvement with RPI, post-mortem, and compile dependencies.

When should I use evolve?

In Operate when iterating on production-adjacent repos; after Ship when harvesting post-mortem findings; in Build PM when turning analysis into the next prioritized work item via supervised loops.

Is evolve safe to install?

It declares code-changing output and shell-oriented operator flows. Before enabling autonomous loops, review permissions, back up the repo, and check the Security Audits panel on this page.

Workflow Chain

Requires first: rpi, post-mortem

Then invoke: rpi

SKILL.md

# /evolve — Goal-Driven Compounding Loop

> **Cross-vendor analog:** Anthropic Managed Agents Outcomes (May 2026). Both close the loop "agent runs → grader scores against a rubric → agent retries"; AgentOps does it locally against any model.

> Measure what's wrong. Fix the worst thing. Measure again. Compound.

**V2 command surface:** keep the name `evolve`. Use `ao evolve` for the
terminal-native loop. It is the top-level operator entrypoint for
`ao rpi loop --supervisor`, preserving the old `/evolve` concept while reusing
the v2 RPI loop engine.

**Operator cadence:** post-mortem finished work, analyze the current repo state,
select or create the next highest-value work item, let `/rpi` handle research,
planning, pre-mortem, implementation, and validation, then harvest follow-ups
and repeat until a kill switch, max-cycle cap, regression breaker, or real
dormancy stops the run.
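
The stop conditions in the cadence above can be sketched as a small control loop. This is an illustrative sketch only, not the `ao evolve` implementation: `run_loop`, `StopReason`, and the three callbacks are hypothetical names, and a `select_work` result of `None` stands in for real dormancy.

```python
from enum import Enum
from typing import Callable, Optional


class StopReason(Enum):
    KILL_SWITCH = "kill switch"
    MAX_CYCLES = "max-cycle cap"
    REGRESSION = "regression breaker"
    DORMANT = "dormancy"


def run_loop(
    select_work: Callable[[], Optional[str]],
    run_cycle: Callable[[str], bool],
    kill_requested: Callable[[], bool],
    max_cycles: Optional[int] = None,
) -> tuple[StopReason, int]:
    """Run cycles until one stop condition fires.

    select_work: hypothetical hook, returns the next work item or None.
    run_cycle:   hypothetical hook, returns False if the cycle regressed.
    Returns the stop reason and the number of cycles that ran.
    """
    cycles = 0
    while True:
        # Operator stop is checked before any new work starts.
        if kill_requested():
            return StopReason.KILL_SWITCH, cycles
        if max_cycles is not None and cycles >= max_cycles:
            return StopReason.MAX_CYCLES, cycles
        item = select_work()
        if item is None:
            return StopReason.DORMANT, cycles
        cycles += 1
        # A regressing cycle trips the breaker immediately.
        if not run_cycle(item):
            return StopReason.REGRESSION, cycles
```

Checking the kill switch and the cycle cap before selecting work means a capped run never starts a cycle it is not allowed to finish.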

Always-on autonomous loop over `/rpi`. Work selection order:
1. **Harvested `.agents/rpi/next-work.jsonl` work** (freshest concrete follow-up)
2. **Open ready beads work** (`bd ready`)
3. **Failing goals and directive gaps** (`ao goals measure`)
4. **Testing improvements** (missing/thin coverage, missing regression tests)
5. **Validation tightening and bug-hunt passes** (gates, audits, bug sweeps)
6. **Complexity / TODO / FIXME / drift / dead code / stale docs / stale research mining**
7. **Concrete feature suggestions** derived from repo purpose when no sharper work exists
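
The ladder above is a first-non-empty priority scan. A minimal sketch follows, assuming each rung can be modeled as a function that returns candidate items; `select_next` and the rung names are hypothetical and do not reflect how the skill actually reads `next-work.jsonl`, `bd ready`, or `ao goals measure`.

```python
from typing import Callable, Optional, Sequence

# A rung is a (name, source) pair; the source returns candidate work items.
Source = Callable[[], Sequence[str]]


def select_next(ladder: Sequence[tuple[str, Source]]) -> Optional[tuple[str, str]]:
    """Return (rung name, first item) from the highest-priority non-empty rung."""
    for name, source in ladder:
        items = source()
        if items:
            return name, items[0]
    # Every rung came up empty on this pass.
    return None


# Stub sources standing in for the real queues (assumed data).
ladder = [
    ("harvested", lambda: []),
    ("beads", lambda: ["bd-12", "bd-13"]),
    ("failing-goals", lambda: ["fix flaky goal"]),
]
print(select_next(ladder))
```

Lower rungs are only consulted when every rung above them is empty, which is why harvested follow-ups always win over generated suggestions.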

**Work generators** that feed the selection ladder (auto-invoked, skip with `--no-lifecycle`):
- `Skill(skill="test", args="coverage")` → files with <40% coverage become queue items (Step 3.4)
- `Skill(skill="refactor", args="--sweep all --dry-run")` → functions with CC > 20 become queue items (Step 3.6)
- `Skill(skill="deps", args="audit")` → deps with CVSS >= 7.0 or 2+ major versions behind become queue items (Step 3.5)
- `Skill(skill="perf", args="profile --quick")` → perf findings become queue items when hot paths detected (Step 3.5)
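
The thresholds above can be read as simple filters over tool output. The sketch below only illustrates the stated cutoffs (<40% coverage, CC > 20, CVSS >= 7.0 or 2+ major versions behind); the function names, the `Dep` type, and the input shapes are assumptions, not the skills' real output formats.

```python
from dataclasses import dataclass

COVERAGE_FLOOR = 40.0       # percent; files strictly below this are queued
COMPLEXITY_CEILING = 20     # cyclomatic complexity; strictly above is queued
CVSS_FLOOR = 7.0            # at or above is queued
MAJOR_VERSIONS_BEHIND = 2   # at or above is queued


@dataclass
class Dep:
    """Hypothetical audit record for one dependency."""
    name: str
    cvss: float
    majors_behind: int


def coverage_items(coverage_by_file: dict[str, float]) -> list[str]:
    return [path for path, pct in coverage_by_file.items() if pct < COVERAGE_FLOOR]


def complexity_items(cc_by_function: dict[str, int]) -> list[str]:
    return [fn for fn, cc in cc_by_function.items() if cc > COMPLEXITY_CEILING]


def dependency_items(deps: list[Dep]) -> list[str]:
    return [
        d.name
        for d in deps
        if d.cvss >= CVSS_FLOOR or d.majors_behind >= MAJOR_VERSIONS_BEHIND
    ]
```

Note the boundary behavior: a file at exactly 40% coverage and a function at exactly CC 20 are not queued, while a dependency at exactly CVSS 7.0 is.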

**Dormancy is last resort.** Empty current queues mean "run the generator layers", not "stop". Only go dormant after the queue layers and generator layers come up empty across multiple consecutive passes.
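
One way to read this rule is as a streak counter that resets whenever any layer produces work. The sketch below is an assumption-laden illustration: `DormancyTracker` is a hypothetical name, and the default of 3 passes is a placeholder, since the source says only "multiple consecutive passes".

```python
class DormancyTracker:
    """Signals dormancy only after N consecutive fully empty passes."""

    def __init__(self, required_empty_passes: int = 3) -> None:
        # 3 is an assumed placeholder, not a value taken from the skill.
        self.required_empty_passes = required_empty_passes
        self.empty_streak = 0

    def record_pass(self, queue_items: int, generator_items: int) -> bool:
        """Record one pass; return True when the loop may go dormant."""
        if queue_items == 0 and generator_items == 0:
            self.empty_streak += 1
        else:
            # Any work from a queue or generator layer resets the streak.
            self.empty_streak = 0
        return self.empty_streak >= self.required_empty_passes
```

An empty queue with a productive generator pass resets the streak, which matches the rule that empty queues mean "run the generator layers" and not "stop".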

```bash
/evolve                      # Run until kill switch, max-cycles, or real dormancy
/evolve --max-cycles=5       # Cap at 5 cycles
/evolve --dry-run            # Show what would be worked on, don't execute
/evolve --beads-only         # Skip goals measurement, work beads backlog only
/evolve --quality            # Quality-first mode: prioritize post-mortem findings
/evolve --quality --max-cycles=10  # Quality mode with cycle cap
/evolve --compile            # Mine → Defrag warmup before first cycle
/evolve --compile --max-cycles=5 # Warm knowledge base then run 5 cycles
/evolve --test-first         # Default strict-quality /rpi execution path
/evolve --no-test-first      # Explicit opt-out from test-first mode
```

## Delineation vs /dream

| Lane | Runs | Mutates code? | Mutates corpus? | Outer loop? | Budget |
|------|------|---------------|-----------------|-------------|--------|
| `/dream` | nightly, private local | **No** | **Yes (heavy)** | **Yes (convergence)** | wall-clock + plateau |
| `/evolve` | daytime, operator-driven | Yes (via `/rpi`) | Yes (light) | Yes | cycle cap |

Dream owns the knowledge compou

What is this skill?

Top-level `ao evolve` entrypoint wrapping RPI supervisor loops

Measure what's wrong, fix the worst thing, measure again—explicit compounding model

Integrates post-mortem harvest, repo analysis, and `/rpi` for research through validation

Output contract: code changes plus GOALS.md fitness deltas

Kill switches: max-cycle cap, regression breaker, and operator stop conditions

Declared dependencies: rpi, post-mortem, compile

Compatible agents: Claude Code, Codex, any compatible agent

Adoption & trust: 865 installs on skills.sh; 384 GitHub stars; 2/3 security scanners passed (skills.sh audits).

What do I get? / Deliverables

The repo receives iterated code changes and updated GOALS.md fitness until the run hits a kill switch, cycle cap, or regression breaker. After that, invoke rpi on the harvested follow-up items.

Iterative code changes from supervised cycles

Updated GOALS.md fitness deltas

Harvested follow-up work items for the next loop
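
A fitness delta is the per-goal change between two measurements. The sketch below assumes each measurement can be reduced to a mapping from goal name to a numeric score where higher is better; that shape, and the names `fitness_deltas`, `worst_goal`, and `regressed`, are hypothetical and not the GOALS.md format.

```python
def fitness_deltas(before: dict[str, float], after: dict[str, float]) -> dict[str, float]:
    """Per-goal change for goals present in both measurements."""
    return {goal: after[goal] - before[goal] for goal in before if goal in after}


def worst_goal(scores: dict[str, float]) -> str:
    """The lowest-scoring goal, i.e. the next thing to fix."""
    return min(scores, key=lambda goal: scores[goal])


def regressed(deltas: dict[str, float]) -> list[str]:
    """Goals whose score dropped; a non-empty list would trip a breaker."""
    return sorted(goal for goal, delta in deltas.items() if delta < 0)


# Assumed example scores, not real measurements.
before = {"reliability": 0.5, "coverage": 1.0}
after = {"reliability": 0.75, "coverage": 0.5}
deltas = fitness_deltas(before, after)
print(deltas, worst_goal(after), regressed(deltas))
```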

Journey fit

Spans multiple journey phases: a primary shelf plus alternate fits.
