
Memory Evolution
Tune a NeuralMemory graph from real recall stats—prune dead nodes, consolidate duplicates, and log checkpoints so your coding agent remembers what actually gets used.
Overview
memory-evolution is an agent skill that optimizes NeuralMemory from real recall patterns and logged checkpoints. It is most often used in the Operate phase, with alternate fits in Build (agent-tooling) and Grow (lifecycle).
Install
npx skills add https://github.com/nhadaututtheky/neural-memory --skill memory-evolution
What is this skill?
- Full evolution cycle: usage pattern discovery, bottleneck report, and concrete consolidation/pruning/enrichment actions
- Hot/cold/dead memory frequency analysis via nmem_recall, nmem_stats, nmem_health, and nmem_context
- Checkpoint Q&A log so later cycles can measure improvement over time
- Operates like a performance tuner for neural memory graphs, not a one-shot import
- Specialist persona: Memory Evolution Specialist with a required four-part output (usage analysis, bottleneck report, evolution actions, checkpoint log)
Adoption & trust: 651 installs on skills.sh; 203 GitHub stars; 3/3 security scanners passed (skills.sh audits).
What problem does it solve?
Your agent’s long-term memory is full of stale facts and weak links, so recall is slow, noisy, or ignores what you actually reuse.
Who is it for?
Solo builders already running NeuralMemory (~/.neuralmemory/config.toml) who want data-driven pruning and consolidation—not manual note dumps.
Skip if: Greenfield projects with no memory history yet, or users not using NeuralMemory’s nmem_* tool surface.
When should I use this skill?
Analyze memory usage patterns and optimize NeuralMemory when recall is confusing, bloated, or underperforming—or run the full evolution cycle with no specific focus.
What do I get? / Deliverables
You get a bottleneck report, prioritized evolution actions on the memory graph, and checkpoint logs to compare recall quality on the next cycle.
- Usage analysis (hot/cold/dead memories)
- Bottleneck report
- Evolution action list
- Checkpoint log
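The usage analysis can be pictured as a small classifier. The sketch below is a hypothetical Python illustration, not part of the skill itself: it applies the thresholds from the Step 1.1 table in SKILL.md (further down this page) to a made-up `MemoryUsage` record. The field names are assumptions, since NeuralMemory's actual stats schema is not shown here. Checking Zombie before the other categories, and returning `unclassified` for cases the table does not cover, are also assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MemoryUsage:
    """Hypothetical per-memory usage record; field names are assumptions."""
    age_days: int
    recalls_last_7d: int
    recalls_last_30d: int
    days_since_last_recall: Optional[int]    # None = never recalled
    max_recall_confidence: Optional[float]   # None = never recalled


def classify(m: MemoryUsage) -> str:
    """Map a usage record onto the categories from the Step 1.1 table."""
    # Zombie: recalled, but every recall had confidence below 0.3.
    # Checking this first is an assumption; the table gives no precedence.
    if m.max_recall_confidence is not None and m.max_recall_confidence < 0.3:
        return "zombie"
    # Hot: recalled 5+ times in the last 7 days.
    if m.recalls_last_7d >= 5:
        return "hot"
    # Warm: recalled at least once in the last 30 days.
    if m.recalls_last_30d >= 1:
        return "warm"
    # Dead: never recalled since creation and older than 90 days.
    if m.days_since_last_recall is None:
        return "dead" if m.age_days > 90 else "unclassified"
    # Cold: last recall was 30 to 90 days ago.
    if 30 <= m.days_since_last_recall <= 90:
        return "cold"
    # The table does not cover the remaining cases
    # (for example, last recalled more than 90 days ago).
    return "unclassified"
```

Memories that come back as `unclassified` are worth reviewing by hand rather than pruning automatically, because the table leaves their handling open.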
Journey fit
Spans multiple journey phases - primary shelf plus alternate fits below.
Long-running agent memory decays without maintenance; operate/iterate is where you refine what the agent recalls after weeks of real sessions. Iterate is the shelf for evidence-based optimization cycles rather than first-time memory setup.
Where it fits
Run the full evolution cycle after a month of nmem_auto usage to drop dead memories.
Diagnose recall bottlenecks before adding new nmem_habits for a custom coding workflow.
Consolidate duplicate product facts so support-style agent replies stay consistent.
Audit memory health before shipping a customer-facing agent feature that relies on stored preferences.
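As a rough illustration of the bottleneck diagnosis mentioned above, the sketch below runs over the sample quality map from Step 1.2 of SKILL.md. It is a hypothetical Python helper, not something the skill ships. Treating incomplete topics as "low quality" is an assumption. The 0.8 and 0.3 defaults come from the "Confidence cliff" row of the pattern table.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TopicQuality:
    """One row of the sample quality map; the structure is an assumption."""
    topic: str
    confidence: float
    complete: bool


# Values copied from the sample quality map in SKILL.md.
QUALITY_MAP = [
    TopicQuality("postgresql", 0.85, True),
    TopicQuality("auth", 0.42, False),
    TopicQuality("deployment", 0.71, True),
    TopicQuality("api-design", 0.31, False),
    TopicQuality("testing", 0.00, False),
]


def low_quality_topics(rows: List[TopicQuality]) -> List[TopicQuality]:
    """Return incomplete topics, weakest confidence first."""
    return sorted((r for r in rows if not r.complete), key=lambda r: r.confidence)


def has_confidence_cliff(rows: List[TopicQuality],
                         high: float = 0.8, low: float = 0.3) -> bool:
    """True when some topics score at or above `high` while others fall below `low`."""
    return (any(r.confidence >= high for r in rows)
            and any(r.confidence < low for r in rows))
```

With the sample data, `low_quality_topics(QUALITY_MAP)` lists `testing`, `api-design`, and `auth` in that order, and `has_confidence_cliff(QUALITY_MAP)` is true because `postgresql` sits at 0.85 while `testing` sits at 0.00.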
How it compares
Use as a maintenance workflow on top of basic remember/recall skills—not a replacement for initial memory capture.
Common Questions / FAQ
Who is memory-evolution for?
Indie developers and agent builders on NeuralMemory who need ongoing tuning of what gets stored, recalled, and dropped based on usage evidence.
When should I use memory-evolution?
In operate/iterate after sustained agent use, in build/agent-tooling when wiring memory health checks, or in grow/lifecycle when recall quality affects user-facing agent behavior.
Is memory-evolution safe to install?
It delegates to NeuralMemory tools that can remember and reshape stored context; review the Security Audits panel on this page and back up config before pruning.
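One way to take that backup is a timestamped copy of the config file. The sketch below is a minimal Python example using only the standard library. The path `~/.neuralmemory/config.toml` is the one mentioned earlier on this page, and the helper name `backup_file` is hypothetical. It copies only the config file; where NeuralMemory keeps the memory data itself is not stated here, so back that up separately if needed.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_file(path: Path) -> Path:
    """Copy `path` next to itself with a timestamp suffix and return the copy's path."""
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = path.with_name(f"{path.name}.{stamp}.bak")
    shutil.copy2(path, target)  # copy2 also preserves file metadata where possible
    return target


# Example usage (assumes the config lives at the default location):
# backup_file(Path.home() / ".neuralmemory" / "config.toml")
```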
SKILL.md
# Memory Evolution

## Agent

You are a Memory Evolution Specialist for NeuralMemory. You analyze how memories are actually used — what gets recalled, what gets ignored, what causes confusion — and transform those observations into concrete optimization actions. You operate like a database performance tuner, but for human-like neural memory graphs.

## Instruction

Analyze memory usage patterns and optimize: $ARGUMENTS

If no specific focus given, run the full evolution cycle.

## Required Output

1. **Usage analysis** — Which memories are hot/cold/dead, recall patterns
2. **Bottleneck report** — What slows down or confuses recall
3. **Evolution actions** — Specific consolidation, pruning, enrichment operations
4. **Checkpoint log** — Record of decisions made for future evolution cycles

## Method

### Phase 1: Usage Pattern Discovery

Collect evidence about how the brain is actually used.

#### Step 1.1: Frequency Analysis

```
nmem_stats → total memories, type distribution, age distribution
nmem_health → activation efficiency, recall confidence, connectivity
nmem_habits(action="list") → learned workflow patterns
```

Classify memories by access pattern:

| Category | Criteria | Action |
|----------|----------|--------|
| **Hot** | Recalled 5+ times in last 7 days | Protect, possibly promote to higher priority |
| **Warm** | Recalled 1-4 times in last 30 days | Healthy, no action needed |
| **Cold** | Not recalled in 30-90 days | Review for relevance |
| **Dead** | Not recalled since creation, >90 days old | Candidate for pruning |
| **Zombie** | Recalled but always with low confidence (<0.3) | Candidate for rewrite or enrichment |

#### Step 1.2: Recall Quality Sampling

Test recall quality with representative queries across key topics:

```
For each of the top 5 tags in the brain:
  1. nmem_recall("What do we know about {tag}?", depth=2)
  2. Record: confidence, neurons_activated, context quality
  3. Note: Was the answer useful? Complete? Contradictory?
```

Build a quality map:

```
Topic Recall Quality:
  "postgresql"  — confidence: 0.85, complete: yes, useful: yes
  "auth"        — confidence: 0.42, complete: no, useful: partial (missing OAuth details)
  "deployment"  — confidence: 0.71, complete: yes, useful: yes
  "api-design"  — confidence: 0.31, complete: no, useful: no (too vague)
  "testing"     — confidence: 0.00, complete: no, useful: no (zero memories)
```

#### Step 1.3: Pattern Detection

Look for recurring issues:

| Pattern | Signal | Root Cause |
|---------|--------|------------|
| **Fragmented topic** | Many weak memories, none complete | Needs consolidation into fewer, richer memories |
| **Missing reasoning** | Decisions recalled without "why" | Needs enrichment (add reasoning post-hoc) |
| **Stale chain** | Causal chain leads to outdated conclusion | Needs update or deprecation marker |
| **Tag sprawl** | Same concept under 3+ different tags | Needs tag normalization |
| **Confidence cliff** | Some topics 0.8+, others <0.3 | Uneven knowledge capture |
| **Recall dead-ends** | Queries return empty or irrelevant | Missing memories for important topics |

### Phase 2: Bottleneck Analysis

For each low-quality topic identified in Phase 1:

#### Step 2.1: Root Cause Diagnosis

Ask in order (stop when cause found):

1. **Missing data?** — Are there simply no memories about this topic?
   - Fix: Memory intake session for this topic
2. **Fragmented data?*