
Implement Task
Drive feature implementation from a task spec file with sub-agents, LLM-as-Judge checks, and resume/refine CLI modes until steps pass.
Overview
Implement Task is an agent skill most often used in Build (also Ship) that implements steps from a task spec using sub-agents and LLM-as-Judge verification until critical checkpoints pass.
Install
npx skills add https://github.com/neolabhq/context-engineering-kit --skill implement-task
What is this skill?
- Task-file driven implementation with automated LLM-as-Judge verification on critical steps
- CLI modes: --continue (resume with state judge), --refine (git-diff affected steps only), --human-in-the-loop step list
- Mandate to iterate with implementation agents until issues are resolved or stopping becomes critically necessary
- Parses task-file path plus optional flags from $ARGUMENTS without pausing for non-critical questions
- Multi-step pipeline: launch implementer, run judges, fix, then advance
- Automated LLM-as-Judge verification on critical implementation steps
- Three CLI control modes: --continue, --refine, and --human-in-the-loop
Adoption & trust: 536 installs on skills.sh; 1.1k GitHub stars; 3/3 security scanners passed (skills.sh audits).
What problem does it solve?
Your agent stops after partial implementation or skips quality checks because no one enforces judged verification between task steps.
Who is it for?
Solo builders with feature-level .md task specs who want autonomous implement-and-verify loops in an existing git repo.
Skip if: Greenfield ideas without a written task breakdown, or teams that only need a single-file snippet without multi-step agent orchestration.
When should I use this skill?
You have a task specification file and need the agent to implement steps with judged verification, optional human pauses, and resume or git-aware refine behavior.
What do I get? / Deliverables
Steps from the task file are implemented iteratively with judge-approved artifacts, optional human gates, and --continue or --refine recovery paths tied to git state.
- Implemented code changes per task steps
- Judge-verified critical artifacts before step advance
- Resumable progress when using --continue or incremental --refine
Journey fit
Spans multiple journey phases - primary shelf plus alternate fits below.
Primary value is executing coded tasks from structured specs with agent orchestration—canonical shelf is Build agent-tooling, with verification bleeding into Ship. Agent-tooling fits sub-agent launches, judge loops, and task-file-driven implementation rather than one-shot code generation.
Where it fits
Run implement-task against add-validation.feature.md with judges after each critical acceptance step.
Resume a stalled API migration using --continue so the judge verifies repo state before the next sub-agent pass.
Hold release until LLM-as-Judge scores pass on artifacts listed as critical in the task file.
How it compares
Use instead of free-form 'implement this feature' chat when you need judged checkpoints and resumable task-file execution—not a lightweight code generator.
Common Questions / FAQ
Who is implement-task for?
Indie developers and agent-native teams who already write structured task files and want verified, multi-step implementation with sub-agents.
When should I use implement-task?
Use it in Build agent-tooling when executing a task spec end-to-end, and in Ship testing when LLM-as-Judge gates must pass before merging or releasing.
Is implement-task safe to install?
Review the Security Audits panel on this Prism page; the skill drives implementation agents that may use shell, git, and filesystem access in your repo.
SKILL.md
SKILL.md - Implement Task
# Implement Task with Verification

Your job is to implement the solution in the best quality using the task specification and sub-agents. You MUST NOT stop until it is critically necessary or you are done! Avoid asking questions until it is critically necessary! Launch the implementation agent and judges, iterate until issues are fixed, and then move to the next step!

Execute task implementation steps with automated quality verification using LLM-as-Judge for critical artifacts.

## User Input

```text
$ARGUMENTS
```

---

## Command Arguments

Parse the following arguments from `$ARGUMENTS`:

### Argument Definitions

| Argument | Format | Default | Description |
|----------|--------|---------|-------------|
| `task-file` | Path or filename | Auto-detect | Task file name or path (e.g., `add-validation.feature.md`) |
| `--continue` | `--continue` | None | Continue implementation from last completed step. Launches judge first to verify state, then iterates with implementation agent. |
| `--refine` | `--refine` | `false` | Incremental refinement mode - detect changes against git and re-implement only affected steps (from modified step onwards). |
| `--human-in-the-loop` | `--human-in-the-loop [step1,step2,...]` | None | Steps after which to pause for human verification. If no steps specified, pauses after every step. |
| `--target-quality` | `--target-quality X.X` or `--target-quality X.X,Y.Y` | `4.0` (standard) / `4.5` (critical) | Target threshold value (out of 5.0). Single value sets both. Two comma-separated values set standard,critical. |
| `--max-iterations` | `--max-iterations N` | `3` | Maximum fix→verify cycles per step. Default is 3 iterations. Set to `unlimited` for no limit. |
| `--skip-judges` | `--skip-judges` | `false` | Skip all judge validation checks - steps proceed without quality gates. |
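
As a rough illustration of the argument table above, the following Python sketch shows one way the contents of `$ARGUMENTS` might be resolved into a configuration. The `Config` dataclass and the `parse_arguments` function are hypothetical names introduced here and are not part of the skill. The sketch also assumes that step identifiers are integers and that unrecognized flags are silently ignored.

```python
import re
import shlex
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class Config:
    """Hypothetical container for the resolved arguments (not part of the skill)."""
    task_file: Optional[str] = None        # None means "auto-detect"
    standard_threshold: float = 4.0        # default for standard components
    critical_threshold: float = 4.5        # default for critical components
    max_iterations: Optional[int] = 3      # None means "unlimited"
    human_in_the_loop: Union[str, List[int]] = field(default_factory=list)  # "*" = every step
    skip_judges: bool = False
    refine: bool = False
    continue_mode: bool = False


def parse_arguments(raw: str) -> Config:
    """Resolve a raw argument string using the defaults from the argument table."""
    cfg = Config()
    tokens = shlex.split(raw)
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        has_value = nxt is not None and not nxt.startswith("--")

        if tok == "--continue":
            cfg.continue_mode = True
        elif tok == "--refine":
            cfg.refine = True
        elif tok == "--skip-judges":
            cfg.skip_judges = True
        elif tok == "--target-quality" and has_value:
            # One value sets both thresholds; two values set standard,critical.
            parts = [float(p) for p in nxt.split(",")]
            cfg.standard_threshold = parts[0]
            cfg.critical_threshold = parts[-1]
            i += 1
        elif tok == "--max-iterations" and has_value:
            cfg.max_iterations = None if nxt == "unlimited" else int(nxt)
            i += 1
        elif tok == "--human-in-the-loop":
            # Assumption: a step list looks like "2,4"; anything else is not consumed.
            if nxt is not None and re.fullmatch(r"\d+(,\d+)*", nxt):
                cfg.human_in_the_loop = [int(s) for s in nxt.split(",")]
                i += 1
            else:
                cfg.human_in_the_loop = "*"  # no step list: pause after every step
        elif not tok.startswith("--") and cfg.task_file is None:
            cfg.task_file = tok  # first positional argument is the task file
        i += 1
    return cfg
```

For example, `parse_arguments("add-validation.feature.md --continue --target-quality 4.2,4.8")` would yield a configuration with `continue_mode` set and thresholds of 4.2 (standard) and 4.8 (critical), while leaving the remaining fields at the defaults listed in the table.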

### Configuration Resolution

Parse `$ARGUMENTS` and resolve configuration as follows:

```
# Extract task file (first positional argument, optional - auto-detect if not provided)
TASK_FILE = first argument that is a file path or filename

# Parse --target-quality (supports single value or two comma-separated values)
if --target-quality has single value X.X:
    THRESHOLD_FOR_STANDARD_COMPONENTS = X.X
    THRESHOLD_FOR_CRITICAL_COMPONENTS = X.X
elif --target-quality has two values X.X,Y.Y:
    THRESHOLD_FOR_STANDARD_COMPONENTS = X.X
    THRESHOLD_FOR_CRITICAL_COMPONENTS = Y.Y
else:
    THRESHOLD_FOR_STANDARD_COMPONENTS = 4.0  # default
    THRESHOLD_FOR_CRITICAL_COMPONENTS = 4.5  # default

# Initialize other defaults
MAX_ITERATIONS = --max-iterations || 3  # default is 3 iterations
HUMAN_IN_THE_LOOP_STEPS = --human-in-the-loop || [] (empty = none, "*" = all)
SKIP_JUDGES = --skip-judges || false
REFINE_MODE = --refine || false
CONTINUE_MODE = --continue || false

# Special handling for --human-in-the-loop without step list
if --human-in-the-loop present without step numbers:
    HUMAN_IN_THE_LOOP_STEPS = "*" (all steps)
```

### Context Resolution for `--continue`

When `--continue` is used:

1. **Step Resolution:**
   - Parse the task file for `[DONE]` markers on step titles
   - Identify the last incomplete step
   - Launch judge to verify the last INCOMPLETE step's artifacts
   - If judge PASS: Mark step as done and resume from the next step
   - If judge FAIL: Re-implement the step and iterate until PASS
2. **State Recovery:**
   - Check task file location (`in-progress/`, `todo/`, `done/`)
   - If in `todo/`, move to `in-progress/` before continuing
   - Pre-populate captured values from existing artifacts

### Refine Mode Behavior (`--refine`)

When `--refine` is used, it detects changes to **project files** (not the task file) and maps them to implementation steps to determine what needs re-verification.

1. **Detect Chan