
Codex Autoresearch
Run Codex in an unattended improve-verify loop for fixes, audits, debugging, and ship-readiness instead of one-off chat turns.
Overview
codex-autoresearch is an agent skill that runs an unattended improve-verify loop in Codex CLI. It applies at any stage of a project where a solo builder needs autonomous iteration toward a verifiable outcome before committing.
Install
npx skills add https://github.com/leo-lilinxiao/codex-autoresearch --skill codex-autoresearch
What is this skill?
- Autonomous Modify → Verify → Keep/Discard → Repeat loop for Codex CLI
- Request modes: loop, plan, debug, fix, security, ship, and exec
- Loads core principles, structured output spec, and runtime hard invariants for execution modes
- Session resume, environment awareness, and interaction wizard references for interactive launches
- Explicitly excludes ordinary one-shot coding help or casual Q&A
- 4-step activation flow: classify, load core references, load situational references, execute selected mode
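The Modify → Verify → Keep/Discard → Repeat loop named in the list above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the skill's implementation: `improve_verify_loop`, `changes`, and `metric` are hypothetical names, the "state" stands in for a repository, and a higher metric value is assumed to mean better.

```python
from typing import Callable, List, Tuple, TypeVar

S = TypeVar("S")  # whatever is being improved (a number here; a repo in practice)


def improve_verify_loop(
    state: S,
    changes: List[Callable[[S], S]],
    metric: Callable[[S], float],
) -> Tuple[S, List[Tuple[int, float, str]]]:
    """Try each change once; keep it only if the metric strictly improves.

    Hypothetical sketch: the real skill verifies with a shell command and
    logs results to its own artifacts, which are not modeled here.
    """
    best_score = metric(state)          # establish a baseline
    log: List[Tuple[int, float, str]] = []
    for index, change in enumerate(changes):
        candidate = change(state)       # modify: one focused change
        score = metric(candidate)       # verify: mechanical success metric
        if score > best_score:          # keep: adopt the candidate
            state, best_score = candidate, score
            log.append((index, score, "keep"))
        else:                           # discard: state stays unchanged
            log.append((index, score, "discard"))
    return state, log


if __name__ == "__main__":
    # Toy goal: move a number as close to 10 as possible.
    final, history = improve_verify_loop(
        3,
        [lambda x: x + 5, lambda x: x - 4, lambda x: x + 2],
        lambda x: -abs(x - 10),
    )
    print(final, history)
```

The design point is that a discarded change never touches the accepted state, so each iteration starts from the last verified result.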
Adoption & trust: 1 install on skills.sh; 1.8k GitHub stars; 1/3 security scanners passed (skills.sh audits); trending (+100% hot-view momentum).
What problem does it solve?
You need Codex to keep working toward a measurable goal with verify-and-rollback discipline, not stop after one reply.
Who is it for?
Builders running overnight improve-verify loops, systematic debug/fix cycles, security passes, and ship-readiness checks with explicit verification.
Skip if: Ordinary one-shot coding help, casual Q&A, or tasks without a clear verify signal or user-approved autonomous scope.
When should I use this skill?
The user wants Codex to plan or run an unattended improve-verify loop toward a measurable outcome, especially overnight. This includes repeated debugging, fixing, security auditing, and ship-readiness work, but not ordinary one-shot coding.
What do I get? / Deliverables
Codex runs classified long-running iterations with structured outputs and invariants until verification passes or you resume/control an existing session.
- Structured iteration output per references
- Kept changes that pass verification
- Resumable session state when using resume protocol
Journey fit
Useful at every journey phase - explore requirements and options before committing to a direction.
Where it fits
Run a fix/exec loop until tests pass after a regression you cannot close in one chat turn.
Iterate modify-verify until review criteria and structured output spec are satisfied.
Classify as security mode and load runtime invariants before an autonomous audit pass.
Use ship mode to converge on launch checklist items with verifiable completion signals.
Resume an existing session to debug recurring production failures with keep/discard on each patch.
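Each scenario above starts by classifying the request into a mode and loading the matching references. The sketch below illustrates that selection using the file names and the `Mode: <name>` override documented in SKILL.md. The function names, the regular expression, and the case-sensitive matching of `Mode:` are assumptions made for illustration, and the situational references (session resume, environment awareness, interaction wizard) are left out.

```python
import re
from typing import List, Optional

MODES = ("loop", "plan", "debug", "fix", "security", "ship", "exec")
# SKILL.md lists every mode except `plan` as an active execution mode.
ACTIVE_MODES = frozenset(MODES) - {"plan"}

CORE_REFERENCES = [
    "references/core-principles.md",
    "references/structured-output-spec.md",
]
INVARIANTS = "references/runtime-hard-invariants.md"

# Assumption: the override appears literally as "Mode: <name>" in the prompt.
_MODE_PATTERN = re.compile(r"\bMode:\s*([A-Za-z]+)")


def forced_mode(prompt: str) -> Optional[str]:
    """Return the mode forced by `Mode: <name>`, or None if absent or unknown."""
    match = _MODE_PATTERN.search(prompt)
    if match is None:
        return None
    name = match.group(1).lower()
    return name if name in MODES else None


def references_for(mode: str) -> List[str]:
    """List the core references plus the mode's workflow reference."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    refs = list(CORE_REFERENCES)
    if mode in ACTIVE_MODES:
        refs.append(INVARIANTS)  # hard invariants only for execution modes
    refs.append(f"references/{mode}-workflow.md")
    return refs


if __name__ == "__main__":
    mode = forced_mode("Mode: security\nAudit the auth module.") or "loop"
    print(mode, references_for(mode))
```

Falling back to `loop` when no override is present is only a placeholder here; in the skill itself, Codex classifies the request.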
How it compares
Codex-oriented autonomous loop orchestration—not a single-purpose linter skill or a hosted CI product.
Common Questions / FAQ
Who is codex-autoresearch for?
Solo builders using Codex CLI who want unattended or long-session iteration with explicit verify and keep/discard semantics.
When should I use codex-autoresearch?
During Build for sustained fix/debug loops; during Ship for review, security auditing, and ship-readiness; during Operate when repeated verify-fix cycles track production issues—whenever one-shot chat is not enough.
Is codex-autoresearch safe to install?
Execution modes can change code and run verification; review the Security Audits panel on this Prism page and scope permissions before unattended runs.
SKILL.md
# codex-autoresearch

Autonomous goal-directed iteration. Modify -> Verify -> Keep/Discard -> Repeat.

## When Activated

1. Classify the request as `loop`, `plan`, `debug`, `fix`, `security`, `ship`, or `exec`, and parse any inline config from the prompt.
2. Load `references/core-principles.md` and `references/structured-output-spec.md`. For active execution modes (`loop`, `debug`, `fix`, `security`, `ship`, `exec`), also load `references/runtime-hard-invariants.md`.
3. Load only the additional references the current situation needs:
   - `references/session-resume-protocol.md` for every interactive launch or existing-run control path, before deciding fresh vs resumable
   - `references/environment-awareness.md` before choosing hardware-sensitive work
   - `references/interaction-wizard.md` for every new interactive launch (`loop`, `debug`, `fix`, `security`, `ship`) before execution begins
   - `references/results-logging.md` only when debugging TSV/state semantics or helper behavior directly
4. Load the selected mode workflow reference plus only the detailed cross-cutting protocols that actually apply (`lessons`, `pivot`, `health-check`, `parallel`, `web-search`, `hypothesis-perspectives`).
5. Use the bundled helper scripts when stateful artifacts or runtime control are involved. Resolve them relative to the loaded skill bundle root (`<skill-root>/scripts/...`), not the target repo root. In the common repo-local install this means commands such as `python3 .agents/skills/codex-autoresearch/scripts/autoresearch_init_run.py --repo <primary_repo> --workspace-root <workspace_root> ...`.
   - New-run helpers (`autoresearch_init_run.py` and `autoresearch_runtime_ctl.py launch/create-launch`) require both `--repo <primary_repo>` and `--workspace-root <workspace_root>`.
   - Existing-run control-plane helpers (`autoresearch_resume_check.py`, `autoresearch_resume_prompt.py`, `autoresearch_supervisor_status.py`, `autoresearch_health_check.py`, `autoresearch_runtime_ctl.py status/stop/start`) require `--repo <primary_repo>` and resolve the workspace-owned Results directory from the repo-local pointer plus canonical context.
   - `autoresearch_launch_gate.py --repo <primary_repo>` is the pre-wizard gate: it returns `fresh` for a clean repo with no prior artifacts and otherwise uses the same pointer/context recovery path.
6. Execute the selected workflow exactly as written and produce the required structured output and artifacts.

## Core Loop

1. Read the relevant context.
2. Define a mechanical success metric.
3. Establish a baseline.
4. Make one focused change.
5. Verify with a command.
6. Keep or discard the change.
7. Log the result.
8. Repeat.

## Modes

| Mode | Purpose | Primary Reference |
|------|---------|-------------------|
| `loop` | Run the autonomous improvement loop | `references/loop-workflow.md` |
| `plan` | Convert a vague goal into a launch-ready config | `references/plan-workflow.md` |
| `debug` | Hunt bugs with evidence and hypotheses | `references/debug-workflow.md` |
| `fix` | Iteratively reduce errors to zero | `references/fix-workflow.md` |
| `security` | Run a structured security audit | `references/security-workflow.md` |
| `ship` | Gate and execute a ship workflow | `references/ship-workflow.md` |
| `exec` | Non-interactive CI/CD mode with JSON output | `references/exec-workflow.md` |

Use `Mode: <name>` in the prompt to force a specific subworkflow.

## Required Config

For the generic loop, the following fields are needed internally. Codex infers them fro