Self Improving Agent

Name: Self Improving Agent
Author: charon-fan

charon-fan/agent-playbook

32.4k installs
65 repo stars
Updated June 21, 2026

Self-Improving Agent is a skill system for AI agents to learn from all interactions, accumulate patterns into semantic/episodic memory, and continuously improve their own capabilities.

About

Self-Improving Agent is a universal self-improvement system for AI agents that learns from all skill experiences. Agents use it to accumulate patterns, detect and fix guidance errors, and continuously evolve their own capabilities. This lets agents validate and improve their own instructions over time: detected patterns become semantic memory, failed assumptions are corrected with evidence markers, and reusable knowledge is promoted into skill updates.

Multi-memory architecture: semantic + episodic + working memory
Auto-triggered hooks on skill start/complete/error for continuous learning
Evolution markers with source attribution for traceable improvements

Self Improving Agent by the numbers

32,384 all-time installs (skills.sh)
+315 installs in the week ending Jul 28, 2026 (Skillselion tracking)
Ranked #39 of 16,659 AI & Agent Building skills by installs in the Skillselion catalog
Security screen: MEDIUM risk (skills.sh audit)
Data as of Jul 28, 2026 (Skillselion catalog sync)

From the docs

What self-improving-agent says it does

Auto-triggers on skill completion/error with hooks-based self-correction

SKILL.md

npx skills add https://github.com/charon-fan/agent-playbook --skill self-improving-agent

Security audit	1 / 3 scanners passed

What it does

Continuous agent improvement through multi-memory architecture capturing semantic patterns and episodic experiences from skill execution.

Who is it for?

Agents that need to improve over time; capturing lessons from skill execution; detecting and fixing guidance errors; building agent knowledge bases

Skip if: Teams without agent hook support or projects that only need one-off prompts without longitudinal agent memory.

When should I use this skill?

Running agent skills repeatedly; want agents to learn from failures; need to track what works across sessions; improving agent instructions over time

What you get

Hook scripts, stderr telemetry logs, and patterns JSON entries such as prd_document_separation for future sessions.

hook scripts
patterns json
session telemetry logs

By the numbers

Provides 3 hook types: PreToolUse, PostToolUse, and Stop (session end)
Patterns store includes prd_document_separation pattern entry

Files

SKILL.md

Self-Improving Agent

"An AI agent that learns from every interaction, accumulating patterns and insights to continuously improve its own capabilities." — Based on 2025 lifelong learning research

Overview

This is a universal self-improvement system that learns from ALL skill experiences, not just PRDs. It implements a complete feedback loop with:

Multi-Memory Architecture: Semantic + Episodic + Working memory
Self-Correction: Detects and fixes skill guidance errors
Self-Validation: Periodically verifies skill accuracy
Hooks Integration: Auto-triggers on skill events (before_start, after_complete, on_error)
Evolution Markers: Traceable changes with source attribution

Research-Based Design

Based on 2025 research:

Research	Key Insight	Application
SimpleMem	Efficient lifelong memory	Pattern accumulation system
Multi-Memory Survey	Semantic + Episodic memory	World knowledge + experiences
Lifelong Learning	Continuous task stream learning	Learn from every skill use
Evo-Memory	Test-time lifelong learning	Real-time adaptation

The Self-Improvement Loop

┌─────────────────────────────────────────────────────────────────┐
│                    UNIVERSAL SELF-IMPROVEMENT                    │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│   Skill Event → Extract Experience → Abstract Pattern → Update  │
│        │                  │                │         │          │
│        ▼                  ▼                ▼         ▼          │
│   ┌─────────────────────────────────────────────────────┐       │
│   │              MULTI-MEMORY SYSTEM                      │       │
│   ├─────────────────────────────────────────────────────┤       │
│   │  Semantic Memory   │  Episodic Memory  │ Working Memory │  │
│   │  (Patterns/Rules)  │  (Experiences)    │  (Current)     │  │
│   │  memory/semantic/  │  memory/episodic/ │  memory/working/│  │
│   └─────────────────────────────────────────────────────┘       │
│                                                                 │
│   ┌─────────────────────────────────────────────────────┐       │
│   │              FEEDBACK LOOP                            │       │
│   │  User Feedback → Confidence Update → Pattern Adapt   │       │
│   └─────────────────────────────────────────────────────┘       │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

When This Activates

Automatic Triggers (via hooks)

Event	Trigger	Action
before_start	Any skill starts	Log session start
after_complete	Any skill completes	Extract patterns, update skills
on_error	Bash returns non-zero exit	Capture error context, trigger self-correction

Manual Triggers

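User says "自我进化" ("self-evolve"), "self-improve", or "从经验中学习" ("learn from experience")
User says "分析今天的经验" ("analyze today's experience") or "总结教训" ("summarize the lessons")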
User asks to improve a specific skill

Evolution Priority Matrix

Trigger evolution when new reusable knowledge appears:

Trigger	Target Skill	Priority	Action
New PRD pattern discovered	prd-planner	High	Add to quality checklist
Architecture tradeoff clarified	architecting-solutions	High	Add to decision patterns
API design rule learned	api-designer	High	Update template
Debugging fix discovered	debugger	High	Add to anti-patterns
Review checklist gap	code-reviewer	High	Add checklist item
Perf/security insight	performance-engineer, security-auditor	High	Add to patterns
UI/UX spec issue	prd-planner, architecting-solutions	High	Add visual spec requirements
React/state pattern	debugger, refactoring-specialist	Medium	Add to patterns
Test strategy improvement	test-automator, qa-expert	Medium	Update approach
CI/deploy fix	deployment-engineer	Medium	Add to troubleshooting
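
The matrix above is essentially a routing table from a kind of new knowledge to the skills that should receive it. A minimal sketch of that lookup, assuming hypothetical trigger keys (the source only gives prose triggers) and covering just four of the rows:

```python
from typing import Dict, NamedTuple, Optional, Tuple


class Route(NamedTuple):
    targets: Tuple[str, ...]  # skills that should receive the update
    priority: str             # "High" or "Medium", as in the matrix
    action: str               # what to do in the target skill


# A few rows of the matrix. The dictionary keys are hypothetical identifiers
# invented for this sketch; they do not appear in the skill itself.
EVOLUTION_ROUTES: Dict[str, Route] = {
    "debugging_fix": Route(("debugger",), "High", "Add to anti-patterns"),
    "perf_security_insight": Route(
        ("performance-engineer", "security-auditor"), "High", "Add to patterns"
    ),
    "react_state_pattern": Route(
        ("debugger", "refactoring-specialist"), "Medium", "Add to patterns"
    ),
    "ci_deploy_fix": Route(("deployment-engineer",), "Medium", "Add to troubleshooting"),
}


def route_knowledge(trigger: str) -> Optional[Route]:
    """Return the routing entry for a trigger key, or None if it is unmapped."""
    return EVOLUTION_ROUTES.get(trigger)
```

An unmapped trigger returns `None`, which a caller could treat as "capture in memory only, no skill update".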

Multi-Memory Architecture

1. Semantic Memory (`memory/semantic-patterns.json`)

Stores abstract patterns and rules reusable across contexts:

{
  "patterns": {
    "pattern_id": {
      "id": "pat-2025-01-11-001",
      "name": "Pattern Name",
      "source": "user_feedback|implementation_review|retrospective",
      "confidence": 0.95,
      "applications": 5,
      "created": "2025-01-11",
      "category": "prd_structure|react_patterns|async_patterns|...",
      "pattern": "One-line summary",
      "problem": "What problem does this solve?",
      "solution": { ... },
      "quality_rules": [ ... ],
      "target_skills": [ ... ]
    }
  }
}
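
As an illustration of the schema above, the following sketch adds a new entry to an in-memory copy of the pattern store. The function name, the default confidence of 0.5, and the per-day sequence numbering of `id` are assumptions inferred from the example ids, not documented behavior.

```python
from datetime import date


def new_pattern(store: dict, key: str, name: str, source: str, category: str,
                summary: str, created: date, confidence: float = 0.5) -> dict:
    """Add a pattern entry to the store and return it.

    The id format "pat-YYYY-MM-DD-NNN" mirrors the examples in this document;
    numbering entries per creation day is an assumption of this sketch.
    """
    patterns = store.setdefault("patterns", {})
    if key in patterns:
        raise KeyError(f"pattern '{key}' already exists")
    prefix = f"pat-{created.isoformat()}-"
    # Next sequence number among patterns created on the same day.
    seq = 1 + sum(1 for p in patterns.values() if p["id"].startswith(prefix))
    entry = {
        "id": f"{prefix}{seq:03d}",
        "name": name,
        "source": source,
        "confidence": confidence,
        "applications": 0,
        "created": created.isoformat(),
        "category": category,
        "pattern": summary,
    }
    patterns[key] = entry
    return entry
```

Fields such as `problem`, `solution`, `quality_rules`, and `target_skills` are left out here and would be filled in by the caller.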

2. Episodic Memory (`memory/episodic/`)

Stores specific experiences and what happened:

memory/episodic/
├── 2025/
│   ├── 2025-01-11-prd-creation.json
│   ├── 2025-01-11-debug-session.json
│   └── 2025-01-12-refactoring.json

{
  "id": "ep-2025-01-11-001",
  "timestamp": "2025-01-11T10:30:00Z",
  "skill": "debugger",
  "situation": "User reported data not refreshing after form submission",
  "root_cause": "Empty callback in onRefresh prop",
  "solution": "Implement actual refresh logic in callback",
  "lesson": "Always verify callbacks are not empty functions",
  "related_pattern": "callback_verification",
  "user_feedback": {
    "rating": 8,
    "comments": "This was exactly the issue"
  }
}
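
A small sketch of how an episode like the one above could be filed under the year-based layout shown in the tree. The helper names and the `slug` parameter are hypothetical; the source does not say how file names are derived.

```python
import json
from datetime import datetime
from pathlib import Path


def episode_path(memory_root: Path, timestamp: str, slug: str) -> Path:
    """Build memory/episodic/<year>/<date>-<slug>.json from an ISO timestamp."""
    # Replace a trailing "Z" so older Python versions can parse the timestamp.
    day = datetime.fromisoformat(timestamp.replace("Z", "+00:00")).date()
    return memory_root / "episodic" / str(day.year) / f"{day.isoformat()}-{slug}.json"


def save_episode(memory_root: Path, episode: dict, slug: str) -> Path:
    """Write the episode as JSON, creating the year directory if needed."""
    path = episode_path(memory_root, episode["timestamp"], slug)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(episode, indent=2, ensure_ascii=False), encoding="utf-8")
    return path
```

Under these assumptions, the example episode with slug `debug-session` would be stored as `memory/episodic/2025/2025-01-11-debug-session.json`.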

3. Working Memory (`memory/working/`)

Stores current session context:

memory/working/
├── current_session.json   # Active session data
├── last_error.json        # Error context for self-correction
└── session_end.json       # Session end marker
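
To make the role of `last_error.json` concrete, here is a hedged sketch of capturing error context after a failed command. The JSON field names (`command`, `exit_code`, `output_tail`) and the 2000-character tail are assumptions; the source does not specify the file's contents.

```python
import json
from pathlib import Path
from typing import Optional


def build_error_context(command: str, exit_code: int, output: str,
                        tail_chars: int = 2000) -> Optional[dict]:
    """Return an error record for a non-zero exit code, or None on success."""
    if exit_code == 0:
        return None
    return {
        "command": command,
        "exit_code": exit_code,
        # Keep only the end of the output, where the error usually appears.
        "output_tail": output[-tail_chars:],
    }


def write_last_error(working_dir: Path, context: dict) -> Path:
    """Persist the record to working/last_error.json for later self-correction."""
    working_dir.mkdir(parents=True, exist_ok=True)
    path = working_dir / "last_error.json"
    path.write_text(json.dumps(context, indent=2), encoding="utf-8")
    return path
```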

Self-Improvement Process

Phase 1: Experience Extraction

After any skill completes, extract:

What happened:
  skill_used: {which skill}
  task: {what was being done}
  outcome: {success|partial|failure}

Key Insights:
  what_went_well: [what worked]
  what_went_wrong: [what didn't work]
  root_cause: {underlying issue if applicable}

User Feedback:
  rating: {1-10 if provided}
  comments: {specific feedback}

Phase 2: Pattern Abstraction

Convert experiences to reusable patterns:

Concrete Experience	Abstract Pattern	Target Skill
"User forgot to save PRD notes"	"Always persist thinking to files"	prd-planner
"Code review missed SQL injection"	"Add security checklist item"	code-reviewer
"Callback was empty, didn't work"	"Verify callback implementations"	debugger
"Net APY position ambiguous"	"UI specs need exact relative positions"	prd-planner

Abstraction Rules:

If experience_repeats 3+ times:
  pattern_level: critical
  action: Add to skill's "Critical Mistakes" section

If solution_was_effective:
  pattern_level: best_practice
  action: Add to skill's "Best Practices" section

If user_rating >= 7:
  pattern_level: strength
  action: Reinforce this approach

If user_rating <= 4:
  pattern_level: weakness
  action: Add to "What to Avoid" section
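
The abstraction rules above can be read as four independent checks, so one experience may earn more than one level. A minimal sketch under that reading (the function name and argument names are illustrative):

```python
from typing import List, Optional


def classify_pattern(repeats: int, solution_effective: bool,
                     user_rating: Optional[int]) -> List[str]:
    """Return every pattern level that applies to an experience."""
    levels = []
    if repeats >= 3:
        levels.append("critical")        # goes to "Critical Mistakes"
    if solution_effective:
        levels.append("best_practice")   # goes to "Best Practices"
    if user_rating is not None:
        if user_rating >= 7:
            levels.append("strength")    # reinforce this approach
        elif user_rating <= 4:
            levels.append("weakness")    # goes to "What to Avoid"
    return levels
```

Ratings of 5 or 6 match neither threshold, so they add no level on their own.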

Phase 3: Skill Updates

Update the appropriate skill files with evolution markers:

<!-- Evolution: 2025-01-12 | source: ep-2025-01-12-001 | skill: debugger -->

## Pattern Added (2025-01-12)

**Pattern**: Always verify callbacks are not empty functions

**Source**: Episode ep-2025-01-12-001

**Confidence**: 0.95

### Updated Checklist
- [ ] Verify all callbacks have implementations
- [ ] Test callback execution paths

Correction Markers (when fixing wrong guidance):

<!-- Correction: 2025-01-12 | was: "Use callback chain" | reason: caused stale refresh -->

## Corrected Guidance

Use direct state monitoring instead of callback chains:

// ✅ Do: Direct state monitoring
const prevPendingCount = usePrevious(pendingCount);
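
Both marker formats shown in this phase are plain HTML comments, so they can be generated rather than typed by hand. A small sketch (function names are illustrative; the guard against `-->` is an addition of this sketch so that a field value cannot close the comment early):

```python
def _field(value: str) -> str:
    """Reject values that would terminate the HTML comment prematurely."""
    if "-->" in value:
        raise ValueError("marker fields must not contain '-->'")
    return value


def evolution_marker(day: str, source: str, skill: str) -> str:
    """Format an evolution marker with source attribution."""
    return (f"<!-- Evolution: {_field(day)} | source: {_field(source)} "
            f"| skill: {_field(skill)} -->")


def correction_marker(day: str, was: str, reason: str) -> str:
    """Format a correction marker recording the replaced guidance."""
    return (f'<!-- Correction: {_field(day)} | was: "{_field(was)}" '
            f"| reason: {_field(reason)} -->")
```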

Phase 4: Memory Consolidation

1. Update semantic memory (memory/semantic-patterns.json)
2. Store episodic memory (memory/episodic/YYYY-MM-DD-{skill}.json)
3. Update pattern confidence based on applications/feedback
4. Prune outdated patterns (low confidence, no recent applications)
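
The source does not give a formula for confidence updates or a pruning threshold, so the following is only one possible sketch of steps 3 and 4. The update rate, the 0.3 confidence floor, the 180-day idle window, and the `last_applied` field are all assumptions.

```python
from datetime import date


def update_confidence(confidence: float, success: bool, rate: float = 0.1) -> float:
    """Move confidence toward 1.0 after a success and toward 0.0 after a failure."""
    target = 1.0 if success else 0.0
    return round(confidence + rate * (target - confidence), 4)


def prune(patterns: dict, today: date, min_confidence: float = 0.3,
          max_idle_days: int = 180) -> dict:
    """Drop patterns that are both low-confidence and not recently applied."""
    kept = {}
    for key, p in patterns.items():
        # "last_applied" is a hypothetical field; fall back to the creation date.
        last_used = date.fromisoformat(p.get("last_applied", p["created"]))
        idle = (today - last_used).days > max_idle_days
        if p["confidence"] < min_confidence and idle:
            continue
        kept[key] = p
    return kept
```

Requiring both conditions before pruning keeps new low-confidence candidates around long enough to gather evidence.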

Promotion Policy

Self-improvement has two separate jobs:

1. Capture facts, corrections, failed assumptions, and reusable patterns as memory or proposal artifacts.
2. Promote only validated patterns into SKILL.md, AGENTS.md, docs, or CLI behavior.

Default to capture-first. Promote a change only when one of these is true:

The user explicitly asks to update a skill or repository instruction.
The same pattern recurs across multiple episodes.
A focused test or review proves the current guidance is wrong or incomplete.
The change is low-risk documentation that preserves existing behavior and is clearly traceable.

Promotion targets:

Artifact	Use For	Approval Level
`memory/episodic/*.json`	Raw episode facts and signals	Auto
`memory/semantic-patterns.json`	Candidate reusable patterns with confidence	Auto
`memory/proposals/*.md`	Proposed skill/doc/code changes with evidence	Auto
`SKILL.md` / `references/`	Validated workflow guidance	Ask first unless user requested editing
`AGENTS.md` / repo rules	Cross-repo behavior or hard constraints	Ask first
CLI/runtime code	Automation semantics	Require tests
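
The promotion conditions and the approval table can be sketched as two small functions. The return values (`auto`, `ask_first`, `require_tests`), the path-matching rules, and the reading of "multiple episodes" as two or more are assumptions made for illustration; treating every unlisted path as runtime code is likewise a simplification.

```python
from pathlib import PurePosixPath


def may_promote(user_asked: bool = False, episode_count: int = 1,
                disproved_by_test: bool = False, low_risk_doc: bool = False) -> bool:
    """Capture-first: promote only if at least one listed condition holds."""
    return user_asked or episode_count >= 2 or disproved_by_test or low_risk_doc


def approval_level(path: str, user_requested_edit: bool = False) -> str:
    """Map an artifact path to the approval level from the table above."""
    p = PurePosixPath(path)
    if path.startswith(("memory/episodic/", "memory/proposals/")):
        return "auto"
    if path == "memory/semantic-patterns.json":
        return "auto"
    if p.name == "SKILL.md" or "references" in p.parts:
        # "Ask first unless user requested editing"
        return "auto" if user_requested_edit else "ask_first"
    if p.name == "AGENTS.md":
        return "ask_first"
    # Anything else is treated as CLI/runtime code in this sketch.
    return "require_tests"
```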

Self-Correction (on_error hook)

Triggered when:

Bash command returns non-zero exit code
Tests fail after following skill guidance
User reports the guidance produced incorrect results

Process:

## Self-Correction Workflow

1. Detect Error
   - Capture error context from working/last_error.json
   - Identify which skill guidance was followed

2. Verify Root Cause
   - Was the skill guidance incorrect?
   - Was the guidance misinterpreted?
   - Was the guidance incomplete?

3. Create Proposal
   - Write a proposal with evidence, affected skill names, and expected behavior
   - Add correction marker text in the proposal, not directly in the skill yet
   - Update related patterns in semantic memory with low initial confidence

4. Validate Fix
   - Test the corrected guidance
   - Ask user to verify

5. Promote
   - Apply the skill/doc/code change after validation or explicit approval
   - Keep the source episode/proposal id in the change note

Example:

<!-- Correction: 2025-01-12 | was: "useMemo for claimable ids" | reason: stale data at click time -->

## Self-Correction: Click-Time Computation

**Issue**: Using useMemo for claimable IDs caused stale data
**Fix**: Compute at click time for always-fresh data
**Pattern**: click_time_vs_open_time_computation
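
Step 3 of the workflow keeps the correction marker inside a proposal until the fix is validated. A rough sketch of rendering such a proposal follows; the proposal layout, the `proposal_id` format, and the 0.3 starting confidence are hypothetical, since the source only lists what a proposal should contain.

```python
from typing import List

LOW_INITIAL_CONFIDENCE = 0.3  # assumed value; the source only says "low"


def render_proposal(proposal_id: str, day: str, skills: List[str], was: str,
                    reason: str, evidence: str, expected: str) -> str:
    """Render a proposal that carries the correction marker as text only."""
    lines = [
        f"# Proposal {proposal_id}",
        "",
        f"Affected skills: {', '.join(skills)}",
        f"Evidence: {evidence}",
        f"Expected behavior: {expected}",
        f"Initial confidence: {LOW_INITIAL_CONFIDENCE}",
        "",
        "Correction marker to apply on promotion:",
        f'<!-- Correction: {day} | was: "{was}" | reason: {reason} -->',
    ]
    return "\n".join(lines) + "\n"
```

The output would be written under `memory/proposals/` and copied into the skill only after step 4 succeeds.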

Self-Validation

Use the validation template in references/appendix.md when reviewing updates.

Hooks Integration

Runtime Trigger Source

agent-playbook self-improve reads skill chaining from each skill's SKILL.md frontmatter:

metadata:
  hooks:
    after_complete:
      - trigger: self-improving-agent
        mode: background
        reason: "Extract patterns"

Treat metadata.hooks as the single source of truth. Do not maintain a second hardcoded hook map in runtime code. This keeps skill behavior auditable and lets Skill Creator-style reviews inspect the same file that the agent executes.
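
A sketch of what "reading from metadata.hooks" could look like once the frontmatter has been parsed into a dictionary. YAML parsing is left out, and the function name is illustrative rather than part of agent-playbook.

```python
from typing import Any, Dict, List


def triggers_for(frontmatter: Dict[str, Any], event: str) -> List[Dict[str, Any]]:
    """Return the hook entries declared for an event, or an empty list."""
    hooks = (frontmatter.get("metadata") or {}).get("hooks") or {}
    return list(hooks.get(event) or [])


# Frontmatter equivalent to the YAML example above, already parsed.
example = {
    "metadata": {
        "hooks": {
            "after_complete": [
                {
                    "trigger": "self-improving-agent",
                    "mode": "background",
                    "reason": "Extract patterns",
                }
            ]
        }
    }
}
```

Because the lookup reads only the parsed frontmatter, there is no second hook map to keep in sync.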

Wiring Hooks in Claude Code Settings

For Claude Code, install hooks through agent-playbook init --hooks when possible. If you need manual setup, add hook entries to Claude Code settings at the appropriate user or project scope.

{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash|Write|Edit",
        "hooks": [
          {
            "type": "command",
            "command": "bash ${SKILLS_DIR}/self-improving-agent/hooks/pre-tool.sh \"$TOOL_NAME\" \"$TOOL_INPUT\""
          }
        ]
      }
    ],
    "PostToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          {
            "type": "command",
            "command": "bash ${SKILLS_DIR}/self-improving-agent/hooks/post-bash.sh \"$TOOL_OUTPUT\" \"$EXIT_CODE\""
          }
        ]
      }
    ],
    "Stop": [
      {
        "matcher": "",
        "hooks": [
          {
            "type": "command",
            "command": "bash ${SKILLS_DIR}/self-improving-agent/hooks/session-end.sh"
          }
        ]
      }
    ]
  }
}

Replace ${SKILLS_DIR} with your actual skills path.
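
If you prefer to generate the block instead of editing it by hand, the substitution can be sketched as follows. The function name is hypothetical, the output mirrors the JSON above, and a skills path containing spaces would need extra shell quoting that this sketch does not handle.

```python
import json


def build_hook_settings(skills_dir: str) -> dict:
    """Return the hooks block above with ${SKILLS_DIR} replaced by a real path."""
    base = f"bash {skills_dir}/self-improving-agent/hooks"

    def command(cmd: str) -> dict:
        return {"type": "command", "command": cmd}

    return {
        "hooks": {
            "PreToolUse": [{
                "matcher": "Bash|Write|Edit",
                "hooks": [command(f'{base}/pre-tool.sh "$TOOL_NAME" "$TOOL_INPUT"')],
            }],
            "PostToolUse": [{
                "matcher": "Bash",
                "hooks": [command(f'{base}/post-bash.sh "$TOOL_OUTPUT" "$EXIT_CODE"')],
            }],
            "Stop": [{
                "matcher": "",
                "hooks": [command(f"{base}/session-end.sh")],
            }],
        }
    }


def render_hook_settings(skills_dir: str) -> str:
    """Serialize the settings for pasting into a settings file."""
    return json.dumps(build_hook_settings(skills_dir), indent=2)
```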

Additional References

See references/appendix.md for memory structure, workflow diagrams, metrics, feedback templates, and research links.

Best Practices

DO

✅ Learn from EVERY skill interaction
✅ Extract patterns at the right abstraction level
✅ Update multiple related skills
✅ Track confidence and apply counts
✅ Ask for user feedback on improvements
✅ Use evolution/correction markers for traceability
✅ Validate guidance before applying broadly
✅ Write proposals before mutating durable skill guidance
✅ Keep hook routing in metadata.hooks

DON'T

❌ Over-generalize from single experiences
❌ Update skills without confidence tracking
❌ Ignore negative feedback
❌ Make changes that break existing functionality
❌ Create contradictory patterns
❌ Update skills without understanding context
❌ Silently promote self-improvement findings into repo rules
❌ Duplicate hook definitions in CLI code and skill frontmatter

Quick Start

After a high-signal skill workflow completes, this agent:

1. Analyzes what happened
2. Extracts patterns and insights
3. Writes memory and proposal artifacts
4. Promotes validated improvements only when approval or evidence is sufficient
5. Reports a summary to the user

References

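#!/usr/bin/env bash
# PostToolUse hook (post-bash.sh): logs the Bash exit code to stderr.
# Set SELF_IMPROVING_AGENT_DEBUG=1 to also log the length of the tool output.
set -euo pipefail

tool_output="${1:-}"   # $1: tool output, empty if not passed
exit_code="${2:-0}"    # $2: exit code, defaults to 0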

echo "[self-improving-agent] PostToolUse: exit=${exit_code}" >&2
if [[ "${SELF_IMPROVING_AGENT_DEBUG:-0}" == "1" && -n "${tool_output}" ]]; then
  printf '[self-improving-agent] Output length: %s bytes\n' "${#tool_output}" >&2
fi

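#!/usr/bin/env bash
# PreToolUse hook (pre-tool.sh): logs the name of the tool about to run to stderr.
# Set SELF_IMPROVING_AGENT_DEBUG=1 to also log the length of the tool input.
set -euo pipefail

tool_name="${1:-unknown}"   # $1: tool name, "unknown" if not passed
tool_input="${2:-}"         # $2: tool input, empty if not passed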

echo "[self-improving-agent] PreToolUse: ${tool_name}" >&2
if [[ "${SELF_IMPROVING_AGENT_DEBUG:-0}" == "1" && -n "${tool_input}" ]]; then
  printf '[self-improving-agent] Input length: %s bytes\n' "${#tool_input}" >&2
fi

{
  "patterns": {
    "prd_document_separation": {
      "id": "pat-2025-01-11-001",
      "name": "Document Separation for Complex PRDs",
      "source": "user_feedback",
      "confidence": 0.95,
      "applications": 0,
      "created": "2025-01-11",
      "category": "prd_structure",
      "pattern": "For non-trivial PRDs, split into 4 files with clear purposes",
      "problem": "Single large PRD file (~500 lines) with mixed product/technical content is hard to follow",
      "solution": {
        "files": [
          {
            "name": "{name}-notes.md",
            "purpose": "Thinking process, options analysis",
            "audience": "Self + future reviewers"
          },
          {
            "name": "{name}-task-plan.md",
            "purpose": "Project tracking, phases, progress",
            "audience": "PM + development lead"
          },
          {
            "name": "{name}-prd.md",
            "purpose": "Product requirements (what & why)",
            "audience": "PM + stakeholders + developers"
          },
          {
            "name": "{name}-tech.md",
            "purpose": "Technical design (how)",
            "audience": "Developers + architects"
          }
        ]
      },
      "quality_rules": [
        "PRD focuses on problem, goals, scope, user flows",
        "Tech doc focuses on API, data flow, implementation",
        "Notes document architecture options with A/B/C analysis",
        "Task plan has checkboxes with timestamps",
        "PRD references tech doc, doesn't duplicate"
      ],
      "target_skills": ["prd-planner", "architecting-solutions"]
    },
    "state_monitoring_over_callbacks": {
      "id": "pat-2025-01-11-002",
      "name": "Direct State Monitoring vs Callbacks",
      "source": "implementation_review",
      "confidence": 0.90,
      "applications": 0,
      "created": "2025-01-11",
      "category": "react_patterns",
      "pattern": "Prefer direct state monitoring over callback chains for side effects",
      "problem": "Callback chains passed through multiple layers are hard to trace and debug",
      "solution": {
        "anti_pattern": "useActionQueue({ onRefresh: () => { /* refresh logic */ } });",
        "pattern": "const pendingCount = requests.length;\nconst prevPendingCount = usePrevious(pendingCount);\nuseEffect(() => {\n  if (pendingCount < prevPendingCount) {\n    triggerDataRefresh({ reason: 'completed' });\n  }\n}, [pendingCount, prevPendingCount]);"
      },
      "when_to_use": [
        "State changes need to trigger side effects",
        "Callback chain would be 3+ layers deep",
        "Multiple components need to react to same state change"
      ],
      "quality_rules": [
        "Use usePrevious to detect state changes instead of callbacks when feasible",
        "Keep state monitoring close to where state is consumed",
        "Use callbacks only for cross-component boundaries"
      ],
      "target_skills": ["debugger", "refactoring-specialist"]
    },
    "state_machine_over_booleans": {
      "id": "pat-2025-01-11-003",
      "name": "State Machine Over Boolean Flags",
      "source": "implementation_review",
      "confidence": 0.85,
      "applications": 0,
      "created": "2025-01-11",
      "category": "async_patterns",
      "pattern": "Use state machines for async operations with multiple phases",
      "problem": "Simple boolean flags can't represent 'waiting to run' vs 'currently running', causing race conditions",
      "solution": {
        "anti_pattern": "const inFlight = false;",
        "pattern": "enum EStatus {\n  Idle = 'idle',\n  Waiting = 'waiting',  // Scheduled but not running yet\n  Running = 'running',\n}"
      },
      "benefits": [
        "Prevents race conditions (can't schedule new request while running)",
        "Distinguishes 'waiting to run' from 'currently running'",
        "Easier to debug and log state transitions"
      ],
      "quality_rules": [
        "Use state machine for async operations with multiple phases",
        "Prevent state transitions that don't make sense",
        "Log state transitions for debugging"
      ],
      "target_skills": ["debugger", "api-designer"]
    },
    "measurable_success_criteria": {
      "id": "pat-2025-01-11-004",
      "name": "Measurable Success Criteria",
      "source": "user_feedback",
      "confidence": 0.90,
      "applications": 0,
      "created": "2025-01-11",
      "category": "prd_quality",
      "pattern": "Success criteria must include specific numbers/timings to enable verification",
      "problem": "Vague success criteria like 'data refreshes' don't enable testing or verification",
      "solution": {
        "bad_examples": [
          "Data refreshes after transaction",
          "Manual refresh works",
          "No performance regression"
        ],
        "good_examples": [
          "Dashboard data refreshes within 3-5 seconds after a pending action completes",
          "Manual refresh button triggers full refresh and shows loading state",
          "API response time under 500ms for 95th percentile"
        ]
      },
      "quality_rules": [
        "Success criteria include specific numbers/timings",
        "Each criterion is objectively verifiable",
        "Performance targets have percentiles (e.g., 95th, 99th)",
        "User-facing behavior has observable indicators"
      ],
      "target_skills": ["prd-planner", "architecting-solutions"]
    },
    "non_goals_section": {
      "id": "pat-2025-01-11-005",
      "name": "Non-Goals Section",
      "source": "user_feedback",
      "confidence": 0.90,
      "applications": 0,
      "created": "2025-01-11",
      "category": "prd_structure",
      "pattern": "Explicitly state what won't be done to prevent scope creep",
      "problem": "Without explicit non-goals, scope creeps during implementation",
      "solution": {
        "structure": "## Goals\n- [Specific achievable outcomes]\n\n## Non-Goals\n- [Explicit exclusions - things that might seem related but aren't]"
      },
      "quality_rules": [
        "Goals section has 3-5 focused items",
        "Non-goals section explicitly excludes reasonable-but-out-of-scope items",
        "Each non-goal has a brief rationale if not obvious"
      ],
      "target_skills": ["prd-planner", "architecting-solutions"]
    },
    "ui_ux_specification_granularity": {
      "id": "pat-2025-01-11-006",
      "name": "UI/UX Specification Granularity",
      "source": "retrospective",
      "confidence": 0.95,
      "applications": 0,
      "created": "2025-01-11",
      "category": "ui_patterns",
      "pattern": "UI/UX PRDs require explicit visual specifications to prevent rework",
      "problem": "Ambiguous UI specs (position, size, spacing) cause implementation rework",
      "solution": {
        "required_elements": {
          "layout_structure": ["Relative position: same row / next row / below / above", "Parent-child container relationships", "Spacing values (gap, padding, margin)"],
          "component_specs": ["Icon/Button sizes: iconSize=\"$4\" (24px)", "Text styles: size=\"$bodyMd\", color=\"$textSubdued\"", "Component variants: size=\"small\", variant=\"tertiary\""],
          "visual_comparison": "Before/After ASCII art showing layout change",
          "executable_criteria": "Checklist with exact prop values"
        },
        "examples": {
          "bad": "Refresh button next to amount",
          "good": "Refresh button in same XStack as amount with gap='$3'"
        }
      },
      "quality_rules": [
        "Relative position explicitly stated (same row/next row/below/above)",
        "Component sizes with exact values (iconSize prop or px)",
        "Spacing values defined (gap=\"$3\", mx=\"$2\")",
        "Before/After visual comparison included",
        "Success criteria are executable (verify by reading code)",
        "Mobile vs desktop differences explicitly called out"
      ],
      "target_skills": ["prd-planner", "architecting-solutions"]
    },
    "reuse_existing_infrastructure": {
      "id": "pat-2025-01-11-007",
      "name": "Reuse Existing Infrastructure",
      "source": "comparison_analysis",
      "confidence": 0.90,
      "applications": 0,
      "created": "2025-01-11",
      "category": "architecture",
      "pattern": "Always check if Context/Provider already has the data before adding new fetching",
      "problem": "Adding duplicate data fetching creates redundant network calls and complexity",
      "solution": {
        "anti_pattern": "const { pendingRequests } = useActionQueue({ workspaceId, userId, client }); // Creates new polling loop!",
        "pattern": "const { pendingRequests } = useFeatureContext(); // Shared provider already updates this"
      },
      "quality_checklist": [
        "Check if Context/Provider already has the data",
        "Verify no duplicate polling/fetching",
        "Confirm single source of truth",
        "Only add new fetching when lifecycle is truly independent"
      ],
      "benefits": ["Reduces network/background calls", "Better performance (no redundant work)", "Single source of truth", "Simpler code (fewer hooks to manage)"],
      "target_skills": ["architecting-solutions", "api-designer", "debugger"]
    },
    "click_time_vs_open_time_computation": {
      "id": "pat-2025-01-11-008",
      "name": "Click-Time vs Open-Time Computation",
      "source": "implementation_review",
      "confidence": 0.85,
      "applications": 0,
      "created": "2025-01-11",
      "category": "react_patterns",
      "pattern": "For mutable state, compute at action time, not at render/init time",
      "problem": "Open-time computation creates stale snapshots when state changes before user acts",
      "solution": {
        "anti_pattern": "const allIds = useMemo(() =>\n  actionableItems.filter(i => !inFlightIds.includes(i.id)),\n  [actionableItems, inFlightIds]\n); // Stale if inFlightIds changes before user clicks",
        "pattern": "onRunAll: (): Promise<void> => {\n  const freshIds = actionableItems\n    .filter(i => !inFlightIds.includes(i.id))\n    .map(i => i.id);\n  return submitBatchAction(freshIds);\n}"
      },
      "decision_matrix": {
        "open_time": ["Immutable data", "Expensive computation"],
        "click_time": ["Mutable state", "User-dependent filters"]
      },
      "benefits": ["State is always fresh when user acts", "No stale data issues", "Simpler reasoning about state"],
      "target_skills": ["debugger", "api-designer"]
    },
    "search_before_creating_components": {
      "id": "pat-2025-01-11-009",
      "name": "Search Before Creating Components",
      "source": "prd_correction",
      "confidence": 0.90,
      "applications": 0,
      "created": "2025-01-11",
      "category": "development",
      "pattern": "ALWAYS search existing codebase before proposing new components/types",
      "problem": "Creating duplicate components creates maintenance burden and UI inconsistency",
      "solution": {
        "pre_prd_search": [
          "grep -r \"Alert\" packages/kit/src/views/ --include=\"*.tsx\"",
          "grep -r \"IAlert\\|Alert\" packages/shared/types/ --include=\"*.ts\"",
          "If found, read existing implementation"
        ],
        "decision_matrix": {
          "existing_component_matches_ui": "Reuse",
          "existing_component_needs_small_tweak": "Extend or wrap",
          "existing_component_has_wrong_responsibilities": "Create new",
          "not_sure": "Reuse first"
        }
      },
      "impact": {
        "duplicate_component": "Over-engineering, UI inconsistency",
        "reuse": "Faster implementation, shared improvements"
      },
      "target_skills": ["prd-planner", "architecting-solutions", "api-designer"]
    },
    "spacing_and_divider_debugging": {
      "id": "pat-2025-01-11-010",
      "name": "Spacing and Divider Debugging",
      "source": "bug_analysis",
      "confidence": 0.85,
      "applications": 0,
      "created": "2025-01-11",
      "category": "debugging",
      "pattern": "When debugging spacing/divider issues, audit all spacing values systematically",
      "problem": "Component spacing (mt, mb, py, padding) can create unintended visual separators that appear as extra lines",
      "solution": {
        "debugging_steps": [
          "Search for spacing-related props in components",
          "Check for StyleSheet.hairlineWidth usage (may render differently per platform)",
          "Compare components that work vs components that have issues",
          "Draw component structure to identify spacing conflicts"
        ],
        "audit_template": "| Element | Before | After | Unit | Notes |\n|---------|--------|-------|------|-------|\n| Trigger padding | `py=\"$3\"` | - | 12px | Accordion.Trigger |\n| Header top margin | `mt=\"$3\"` | `mt=\"$0\"` | 12px → 0px | Remove this |"
      },
      "quality_rules": [
        "Include ASCII diagram showing component structure",
        "List exact spacing values with pixel conversions ($3 = 12px, $5 = 20px)",
        "Compare working vs broken components",
        "Note platform-specific behaviors (hairlineWidth varies)",
        "Verify fix on all platforms (iOS, Android, Desktop, Web)"
      ],
      "target_skills": ["debugger"]
    }
  },
  "meta": {
    "version": "1.0.0",
    "last_updated": "2025-01-12",
    "total_patterns": 10,
    "categories": ["prd_structure", "prd_quality", "react_patterns", "async_patterns", "ui_patterns", "architecture", "development", "debugging"]
  }
}
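
Because the `meta` block duplicates information that can be derived from `patterns`, it can drift out of date. A minimal consistency check, assuming the store has been loaded into a dictionary (the function name and message wording are illustrative):

```python
from typing import List


def check_meta(store: dict) -> List[str]:
    """Return a list of mismatches between meta and the actual patterns."""
    patterns = store.get("patterns", {})
    meta = store.get("meta", {})
    problems = []
    if meta.get("total_patterns") != len(patterns):
        problems.append(
            f"total_patterns is {meta.get('total_patterns')}, found {len(patterns)} patterns"
        )
    declared = set(meta.get("categories", []))
    used = {p["category"] for p in patterns.values() if "category" in p}
    # Report categories in a stable order so output is reproducible.
    for name in sorted(used - declared):
        problems.append(f"category '{name}' is used but not listed in meta.categories")
    return problems
```

An empty list means the counts and categories agree, as they do for the ten patterns listed above.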

Self-Improving Agent

A self-improvement system that captures learning artifacts from skill experiences and proposes validated updates.

Overview

This agent captures reusable evidence from skill interactions. It implements a feedback loop with memory artifacts, self-correction proposals, and evolution markers. Durable skill or code changes still require validation or explicit approval.

Key Features

Multi-Memory Architecture: Semantic + Episodic + Working memory
Evidence-Gated Learning: Captures reusable lessons from skill workflows
Pattern Extraction: Converts experiences into reusable patterns
Self-Correction: Fixes skill guidance when errors occur
Self-Validation: Periodically verifies skill accuracy
Proposal Artifacts: Writes proposed updates before durable skill changes
Confidence Tracking: Measures pattern reliability over time
Human-in-the-Loop: Collects feedback to validate improvements

Memory System

Current Claude Code hook integration writes to:

~/.claude/memory/
├── semantic/       # Patterns, rules, best practices
├── episodic/       # Specific experiences and episodes
└── working/        # Current session context

How It Works

Any Skill Completes
        ↓
Extract Experience → Identify Patterns → Write Proposals → Consolidate Memory
        ↓                     ↓                  ↓              ↓
   What happened?    What can we reuse?   Which proposals? Track metrics

Installation

apb skills add ./skills/self-improving-agent --scope global --target all --link

Hooks (Optional)

Wire hooks to capture errors and session-end signals:

{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash|Write|Edit",
        "hooks": [
          { "type": "command", "command": "bash ${SKILLS_DIR}/self-improving-agent/hooks/pre-tool.sh \"$TOOL_NAME\" \"$TOOL_INPUT\"" }
        ]
      }
    ],
    "PostToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          { "type": "command", "command": "bash ${SKILLS_DIR}/self-improving-agent/hooks/post-bash.sh \"$TOOL_OUTPUT\" \"$EXIT_CODE\"" }
        ]
      }
    ],
    "Stop": [
      {
        "matcher": "",
        "hooks": [
          { "type": "command", "command": "bash ${SKILLS_DIR}/self-improving-agent/hooks/session-end.sh" }
        ]
      }
    ]
  }
}

Triggering

Host-Supported Follow-up

When the host runtime supports hook follow-ups, this skill can be recorded or run after high-signal workflows such as:

prd-planner
code-reviewer
debugger
refactoring-specialist

Manual

"自我进化" ("self-evolve")
"self-improve"
"分析今天的经验" ("analyze today's experience")
"总结这次教训" ("summarize the lessons from this session")

Example Learning

Episode

Skill: debugger
Situation: Form submission doesn't refresh data
Root Cause: Empty callback function
Pattern: Always verify callbacks have implementations
Confidence: 0.95 → Proposals: debugger, prd-implementation-precheck
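
As a rough sketch, the episode above could be held as a structured record before it is written to episodic memory. The `Episode` class and its field names are assumptions made for illustration; the skill's actual episode schema is not shown here.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Episode:
    """Hypothetical record mirroring the fields of the example episode."""
    skill: str
    situation: str
    root_cause: str
    pattern: str
    confidence: float
    proposals: list[str] = field(default_factory=list)

    def to_json(self) -> str:
        # Serialize the record so it can be stored as a JSON file.
        return json.dumps(asdict(self), indent=2)


episode = Episode(
    skill="debugger",
    situation="Form submission doesn't refresh data",
    root_cause="Empty callback function",
    pattern="Always verify callbacks have implementations",
    confidence=0.95,
    proposals=["debugger", "prd-implementation-precheck"],
)
```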

Skill Update

## Proposed Update (2025-01-11)

### Pattern Added
**Callback Verification**: Always verify that callback functions
passed as props are not empty and actually execute logic.

**Source**: Episode ep-2025-01-11-003 (3 occurrences)
**Action**: Propose adding to debugger checklist

Research Basis

Templates

Reusable templates live in skills/self-improving-agent/templates:

pattern-template.md
correction-template.md
validation-template.md

License

MIT

Appendix

Self-Validation

Validation Report Template

## Validation Report Template

**Date**: [YYYY-MM-DD]
**Scope**: [skill(s) validated]

### Checks
- [ ] Examples compile or run
- [ ] Checklists match current repo conventions
- [ ] External references still valid
- [ ] No duplicated or conflicting guidance

### Findings
- [Finding 1]
- [Finding 2]

### Actions
- [Action 1]
- [Action 2]

Memory File Structure

~/.claude/memory/
├── semantic/
│   └── patterns.json
├── episodic/
│   ├── 2025/
│   │   ├── 2025-01-11-prd-creation.json
│   │   └── 2025-01-11-debug-session.json
│   └── episodes.json
├── working/
│   ├── current_session.json
│   ├── last_error.json
│   └── session_end.json
└── index.json
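
The dated file names under `episodic/` suggest a simple naming rule. A minimal sketch, assuming the pattern is `<year>/<YYYY-MM-DD>-<slug>.json`; the `episode_path` helper is hypothetical and the rule is inferred from the two example file names only.

```python
from datetime import date
from pathlib import Path


def episode_path(root: Path, day: date, slug: str) -> Path:
    """Build episodic/<year>/<YYYY-MM-DD>-<slug>.json under the memory root."""
    return root / "episodic" / str(day.year) / f"{day.isoformat()}-{slug}.json"
```

With `root` set to the memory directory, `episode_path(root, date(2025, 1, 11), "debug-session")` points at `episodic/2025/2025-01-11-debug-session.json`, matching the tree above.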

Automatic Workflow Integration

Any Skill Run
  -> workflow-orchestrator
    -> self-improving-agent (background)
    -> create-pr (ask_first)
    -> session-logger (auto)

Continuous Learning Metrics

{
  "metrics": {
    "patterns_learned": 47,
    "patterns_applied": 238,
    "skills_updated": 12,
    "avg_confidence": 0.87,
    "user_satisfaction_trend": "improving",
    "error_rate_reduction": "-35%",
    "self_corrections": 8
  }
}
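
Two of these fields can be derived directly from stored patterns. The sketch below assumes each pattern carries a confidence score keyed by its name; `summarize_patterns` is a hypothetical helper, and how the skill actually computes the remaining metrics is not specified here.

```python
def summarize_patterns(confidences: dict[str, float]) -> dict:
    """Derive patterns_learned and avg_confidence from per-pattern confidence scores."""
    count = len(confidences)
    # Guard against an empty pattern store to avoid dividing by zero.
    avg = round(sum(confidences.values()) / count, 2) if count else 0.0
    return {"metrics": {"patterns_learned": count, "avg_confidence": avg}}
```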

Human-in-the-Loop

Feedback Collection

## Self-Improvement Summary

I've learned from our session and updated:

### Updated Skills
- `debugger`: Added callback verification pattern
- `prd-planner`: Enhanced UI/UX specification requirements

### Patterns Extracted
1. **state_monitoring_over_callbacks**: Use usePrevious for state-driven side effects
2. **ui_ux_specification_granularity**: Explicit visual specs prevent rework

### Confidence Levels
- New patterns: 0.85 (needs validation)
- Reinforced patterns: 0.95 (well-established)

### Your Feedback
Rate these improvements (1-10):
- Were the updates helpful?
- Should I apply this pattern more broadly?
- Any corrections needed?

Feedback Integration

User Feedback:
  positive (rating >= 7):
    action: Increase pattern confidence
    scope: Expand to related skills

  neutral (rating 4-6):
    action: Keep pattern, gather more data
    scope: Current skill only

  negative (rating <= 3):
    action: Decrease confidence, revise pattern
    scope: Remove from active patterns
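
The mapping above translates naturally into a small function. This is a sketch under assumptions: the rating thresholds come from the rules above, while the `integrate_feedback` name, the returned keys, and the 0.05 confidence step are illustrative choices, not values taken from the skill.

```python
def integrate_feedback(rating: int, confidence: float, step: float = 0.05) -> dict:
    """Map a 1-10 user rating to an action, a scope, and an adjusted confidence."""
    if not 1 <= rating <= 10:
        raise ValueError("rating must be between 1 and 10")
    if rating >= 7:
        # Positive: raise confidence (capped at 1.0) and widen the scope.
        new_conf = round(min(1.0, confidence + step), 2)
        return {"action": "increase_confidence", "scope": "related_skills", "confidence": new_conf}
    if rating >= 4:
        # Neutral: keep the pattern unchanged and gather more data.
        return {"action": "keep_and_observe", "scope": "current_skill", "confidence": confidence}
    # Negative: lower confidence (floored at 0.0) and take the pattern out of rotation.
    new_conf = round(max(0.0, confidence - step), 2)
    return {"action": "revise_pattern", "scope": "removed_from_active", "confidence": new_conf}
```

Keeping the step size as a parameter makes it easy to tune how quickly a single rating moves a pattern's confidence.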

Templates

Template	Purpose
`templates/pattern-template.md`	Adding new patterns
`templates/correction-template.md`	Fixing incorrect guidance
`templates/validation-template.md`	Validating skill accuracy

How it compares

Use self-improving-agent for hook-driven agent memory; use static SKILL.md-only guidance when you do not run PreToolUse or PostToolUse pipelines.

FAQ

What hooks does self-improving-agent provide?

self-improving-agent ships bash hooks for PreToolUse (tool name and input), PostToolUse (exit code and output), and session end. Each writes structured lines to stderr for downstream capture.

How does self-improving-agent store learnings?

self-improving-agent maintains a patterns JSON file with named entries such as prd_document_separation and dated pattern IDs, letting future sessions reuse documented agent decisions.

Is Self Improving Agent safe to install?

skills.sh reports 1 of 3 security scanners passed. Review the Security Audits panel on this page before installing in production.
