Agent Harness Construction

Name: Agent Harness Construction
Author: affaan-m

affaan-m/everything-claude-code

Tune tool schemas, observations, and recovery paths so your coding agent completes multi-step tasks more reliably.

Overview

Agent Harness Construction is an agent skill for designing action spaces, observations, and recovery paths so that agents converge on completion. It is used most often in the Build phase, and also in Ship and Operate.

Install

npx skills add https://github.com/affaan-m/everything-claude-code --skill agent-harness-construction

What is this skill?

Four quality levers: action space, observations, recovery, and context budget
Granularity rules: micro-tools for deploy/migrations, medium for edit loops, macro only when round-trips dominate
Structured tool responses with status, summary, next_actions, and artifacts
Error recovery contract: root-cause hint, safe retry, explicit stop condition
Context budgeting: minimal system prompt, skills on demand, compact at phase boundaries

Compatible agents: Claude Code, Cursor, Codex, any compatible agent

Adoption & trust: 4.8k installs on skills.sh; 210k GitHub stars; 3/3 security scanners passed (skills.sh audits).

What problem does it solve?

Your agent has tools and a system prompt, but it still wanders, misreads outputs, or loops on errors because the harness, not the model, is underspecified.

Who is it for?

Indie builders designing or refactoring custom agent toolkits, MCP bundles, or internal coding agents before scaling automations.

Skip if: One-shot chat without persistent tools, or teams that only need a single prebuilt integration with no custom harness.

When should I use this skill?

Improving how an agent plans, calls tools, recovers from errors, and converges on completion.

What do I get? / Deliverables

You leave with explicit tool granularity, observation shapes, and recovery contracts that make plans and tool calls predictable, all ready to implement in your agent config or skill stack.

Action-space and observation contract for your tool suite
Error recovery and stop-condition checklist per tool
Context-budget plan (system vs on-demand skills)

Journey fit

Spans multiple journey phases: a primary shelf plus the alternate fits below.

Primary fit

Build: Agent skills & templates

Harness design is where solo builders define how the agent acts, so its canonical shelf is agent tooling during product build. Action-space design, observation contracts, and context budgeting are core agent-runtime design work, not one-off app features.

Also useful

Ship: Code review

Operate: Iteration & experiments

Where it fits

Example use

Build: Agent skills & templates

Split deploy and migration into micro-tools with narrow schemas before wiring your MCP server.

Example use

Build: Integrations & version control

Standardize every integration response with status, summary, next_actions, and artifact paths.

Example use

Ship: Security

Add explicit stop conditions on permission-sensitive tool errors instead of blind retries.

Example use

Operate: Iteration & experiments

Compact context at phase boundaries after a planning burst rather than at arbitrary token limits.

How it compares

A design discipline for tool-and-observation contracts, not a drop-in MCP server or a generic brainstorming ritual.

Common Questions / FAQ

Who is agent-harness-construction for?

Solo builders and small teams authoring agent skills, MCP tool sets, or harness configs who want fewer failed multi-step runs.

When should I use agent-harness-construction?

In Build when defining tools; in Ship when hardening deploy/migration micro-tools and error paths; in Operate when compaction and recovery at phase boundaries matter.

Is agent-harness-construction safe to install?

It is procedural documentation with no runtime permissions by itself; review the Security Audits panel on this Prism page before trusting the parent repo bundle.

SKILL.md

# Agent Harness Construction

Use this skill when you are improving how an agent plans, calls tools, recovers from errors, and converges on completion.

## Core Model

Agent output quality is constrained by:
1. Action space quality
2. Observation quality
3. Recovery quality
4. Context budget quality

## Action Space Design

1. Use stable, explicit tool names.
2. Keep inputs schema-first and narrow.
3. Return deterministic output shapes.
4. Avoid catch-all tools unless isolation is impossible.
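
As a rough illustration of rules 2 and 3, the sketch below validates a narrow input for a single-purpose tool before anything runs. The tool name `run_migration`, its fields, and the `parse_input` helper are hypothetical and not part of the skill; a real harness would more likely express the same constraints in its own tool-schema format.

```python
from dataclasses import dataclass
from typing import Literal

# Hypothetical tool name, chosen only for illustration.
TOOL_NAME = "run_migration"


@dataclass(frozen=True)
class RunMigrationInput:
    """Narrow input: one migration, one target, no free-form command string."""
    migration_id: str
    environment: Literal["staging", "production"]
    dry_run: bool = True  # assumed default: the safe option


def parse_input(raw: dict) -> RunMigrationInput:
    """Reject unknown keys and invalid values before the tool does anything."""
    allowed = {"migration_id", "environment", "dry_run"}
    unknown = set(raw) - allowed
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    migration_id = raw.get("migration_id")
    if not isinstance(migration_id, str) or not migration_id:
        raise ValueError("migration_id must be a non-empty string")
    environment = raw.get("environment")
    if environment not in ("staging", "production"):
        raise ValueError("environment must be 'staging' or 'production'")
    dry_run = raw.get("dry_run", True)
    if not isinstance(dry_run, bool):
        raise ValueError("dry_run must be a boolean")
    return RunMigrationInput(migration_id, environment, dry_run)
```

Because the input has no catch-all field, a call such as `parse_input({"migration_id": "0042", "environment": "staging", "shell": "..."})` fails fast instead of reaching the tool.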

## Granularity Rules

- Use micro-tools for high-risk operations (deploy, migration, permissions).
- Use medium tools for common edit/read/search loops.
- Use macro-tools only when round-trip overhead is the dominant cost.
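
One way to read the three rules above is as a small decision function. The sketch below is an interpretation, not something the skill specifies: the function name and its two boolean inputs are hypothetical, and letting risk take precedence over round-trip cost is an assumption.

```python
from typing import Literal

Granularity = Literal["micro", "medium", "macro"]


def choose_granularity(high_risk: bool, round_trip_dominant: bool) -> Granularity:
    """Pick a tool size for an operation (hypothetical helper)."""
    # Assumption: risk is checked first, so deploys, migrations, and
    # permission changes stay micro even when round trips are expensive.
    if high_risk:
        return "micro"
    # Macro only when round-trip overhead is the dominant cost.
    if round_trip_dominant:
        return "macro"
    # Default for common edit/read/search loops.
    return "medium"
```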

## Observation Design

Every tool response should include:
- `status`: success|warning|error
- `summary`: one-line result
- `next_actions`: actionable follow-ups
- `artifacts`: file paths / IDs
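
A minimal sketch of this response shape, assuming a Python harness that serializes observations to JSON. The four field names come from the list above; the `ToolResponse` class and its `to_json` method are hypothetical.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Literal


@dataclass
class ToolResponse:
    """One observation returned to the agent after a tool call."""
    status: Literal["success", "warning", "error"]
    summary: str  # one-line result
    next_actions: list[str] = field(default_factory=list)  # actionable follow-ups
    artifacts: list[str] = field(default_factory=list)     # file paths / IDs

    def to_json(self) -> str:
        # Sorted keys keep the serialized shape stable across calls.
        return json.dumps(asdict(self), sort_keys=True)


# Example values are made up for illustration.
response = ToolResponse(
    status="warning",
    summary="Applied 3 edits; 1 test file was skipped",
    next_actions=["run the test suite", "review the skipped file"],
    artifacts=["src/app.py"],
)
```

Keeping all four keys present even when a list is empty means the agent never has to guess whether a field is missing or simply has nothing to report.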

## Error Recovery Contract

For every error path, include:
- root cause hint
- safe retry instruction
- explicit stop condition
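
The contract can be sketched as a small data type plus a bounded retry loop. Everything named here (`ErrorInfo`, `Outcome`, `run_with_recovery`, and the convention that `safe_retry=None` means "do not retry") is an assumption made for illustration, not an API defined by the skill.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ErrorInfo:
    root_cause_hint: str       # what most likely went wrong
    safe_retry: Optional[str]  # how to retry safely; None means do not retry
    stop_condition: str        # when to give up and escalate


@dataclass
class Outcome:
    ok: bool
    attempts: int
    error: Optional[ErrorInfo] = None


def run_with_recovery(
    tool: Callable[[], Optional[ErrorInfo]], max_attempts: int = 3
) -> Outcome:
    """Call `tool` until it succeeds, forbids a retry, or attempts run out."""
    error: Optional[ErrorInfo] = None
    for attempt in range(1, max_attempts + 1):
        error = tool()
        if error is None:
            return Outcome(ok=True, attempts=attempt)
        if error.safe_retry is None:
            # Explicit stop: no blind retries on errors marked unsafe.
            return Outcome(ok=False, attempts=attempt, error=error)
    return Outcome(ok=False, attempts=max_attempts, error=error)
```

A permission failure would return an `ErrorInfo` with `safe_retry=None`, so the loop stops after one attempt, while a transient lock error could carry a retry instruction and be tried again up to `max_attempts`.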

## Context Budgeting

1. Keep the system prompt minimal and invariant.
2. Move large guidance into skills loaded on demand.
3. Prefer references to files over inlining long documents.
4. Compact at phase boundaries, not arbitrary token thresholds.
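
Rule 4 can be illustrated with a context object that compacts only when the phase changes. This is a sketch under stated assumptions: the `Context` class, the phase names, and the `summarize` callable are hypothetical, and in a real harness the summarizer would probably be a model call rather than a plain function.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Context:
    messages: list[str] = field(default_factory=list)
    phase: str = "plan"  # assumed starting phase

    def add(self, message: str) -> None:
        # No token-threshold check here: messages accumulate within a phase.
        self.messages.append(message)

    def enter_phase(self, new_phase: str, summarize: Callable[[list[str]], str]) -> None:
        """Compact the history only when the phase actually changes."""
        if new_phase == self.phase:
            return
        summary = summarize(self.messages)
        # Replace the finished phase's messages with a single summary line.
        self.messages = [f"[{self.phase} summary] {summary}"]
        self.phase = new_phase
```

Compacting at the boundary keeps a phase's working detail intact while it is still in use, and discards it only once the phase has produced its result.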

## Architecture Pattern Guidance

- ReAct: best for exploratory tasks with an uncertain path.
- Function-calling: best for structured deterministic flows.
- Hybrid (recommended): ReAct planning + typed tool execution.
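
A rough sketch of the hybrid pattern, assuming the planner is any callable that looks at the observations so far and either names the next tool call or signals that it is done. The `TOOLS` registry, the `read_file` stub, and `run_loop` are hypothetical; in practice the planner would be a model and the tools would do real work.

```python
from typing import Callable, Optional

# Tool registry: each tool takes keyword arguments and returns an observation.
# `read_file` is a stub that does not touch the filesystem.
TOOLS: dict[str, Callable[..., str]] = {
    "read_file": lambda path: f"contents of {path}",
}

# A step is (tool_name, arguments); None means the planner is finished.
Step = Optional[tuple[str, dict]]


def run_loop(plan_next: Callable[[list[str]], Step], max_steps: int = 5) -> list[str]:
    """Alternate planning and tool execution until done or out of steps."""
    observations: list[str] = []
    for _ in range(max_steps):
        step = plan_next(observations)
        if step is None:
            break
        name, args = step
        if name not in TOOLS:
            # Feed the failure back as an observation instead of crashing.
            observations.append(f"error: unknown tool {name!r}")
            continue
        observations.append(TOOLS[name](**args))
    return observations
```

The planner stays free-form, while every action it takes passes through a fixed registry, which is where schema checks like those described under Action Space Design would sit.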

## Benchmarking

Track:
- completion rate
- retries per task
- pass@1 and pass@3
- cost per successful task
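
The sketch below computes these metrics from per-task records. The definitions are assumptions, since the skill does not fix them: a task counts as completed if any attempt passed, pass@k is the share of tasks with a pass among the first k attempts (an empirical count, not a statistical estimator), and cost per successful task divides total spend by the number of completed tasks. The `TaskRun` type and its fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TaskRun:
    attempts: list[bool]  # outcome of each attempt, in order
    retries: int          # tool-level retries across all attempts
    cost: float           # total spend on this task, in any consistent unit


def pass_at_k(runs: list[TaskRun], k: int) -> float:
    """Share of tasks with at least one pass among the first k attempts."""
    return sum(any(r.attempts[:k]) for r in runs) / len(runs)


def summarize(runs: list[TaskRun]) -> dict[str, float]:
    """Aggregate the tracked metrics; `runs` must be non-empty."""
    completed = sum(any(r.attempts) for r in runs)
    total_cost = sum(r.cost for r in runs)
    return {
        "completion_rate": completed / len(runs),
        "retries_per_task": sum(r.retries for r in runs) / len(runs),
        "pass@1": pass_at_k(runs, 1),
        "pass@3": pass_at_k(runs, 3),
        # Failed tasks still cost money, so they are counted in the numerator.
        "cost_per_successful_task": total_cost / completed if completed else float("inf"),
    }
```

Charging failed runs to the successful ones makes the cost figure rise when retries or abandoned tasks grow, which is usually the signal a harness change is meant to move.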

## Anti-Patterns

- Too many tools with overlapping semantics.
- Opaque tool output with no recovery hints.
- Error-only output without next steps.
- Context overloading with irrelevant references.

