
V3 Performance Optimization
Benchmark and tune claude-flow v3 so Flash Attention, AgentDB search, and memory usage hit documented speedup targets before you rely on it in production agents.
Overview
V3 Performance Optimization is an agent skill, most often used in the Ship phase (also Build agent-tooling and Operate infra), that benchmarks claude-flow v3 against its Flash Attention, HNSW search, and memory-reduction targets.
Install
npx skills add https://github.com/ruvnet/ruflo --skill v3-performance-optimization
What is this skill?
- Validates Flash Attention against a 2.49x–7.47x speedup target with 50–75% memory reduction and sub-millisecond latency.
- Benchmarks AgentDB HNSW search for 150x–12,500x improvement versus baseline search paths.
- Runs parallel Task workflows for baseline, Flash Attention, search optimization, and memory optimization tracks.
- Provides a performance target matrix for continuous benchmarking across attention, search, and system optimization.
- Oriented to validating claude-flow v3 against its stated performance targets, not one-off micro-optimizations.
- 2.49x–7.47x Flash Attention speedup target
- 150x–12,500x search improvement target
- 50–75% memory reduction target
Adoption & trust: 623 installs on skills.sh; 58.5k GitHub stars; 3/3 security scanners passed (skills.sh audits).
What problem does it solve?
You upgraded to claude-flow v3 but have no reproducible baselines to confirm attention speedups, search latency, or memory wins.
Who is it for?
Agent builders optimizing claude-flow v3 stacks who need parallel benchmark tasks and explicit numeric targets.
Skip if: you build simple CRUD apps with no agent runtime, or your team is unwilling to run comparative baselines before tuning.
When should I use this skill?
Validating or optimizing claude-flow v3 performance for Flash Attention, AgentDB search, and memory after integration changes.
What do I get? / Deliverables
You get documented benchmark runs that map measured results to the skill’s speedup and memory matrices so you can ship or roll back v3 changes confidently.
- Performance baseline report
- Target-matrix validation results for attention, search, and memory
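Mapping measured results to the target matrix can be sketched as a simple band check. This is a minimal illustration, not the skill's own implementation: the target ranges are the ones listed on this page, while the names `TARGETS`, `classify`, and `validateAgainstTargets` are hypothetical.

```typescript
// Hypothetical sketch: classify measured results against the target bands
// listed on this page. None of these names come from claude-flow itself.
type TargetKey = "flashAttention" | "search" | "memory";
type Verdict = "below-target" | "in-band" | "above-band";

interface Target {
  key: TargetKey;
  label: string;
  min: number; // lower bound of the target band
  max: number; // upper bound of the target band
  unit: "x" | "%";
}

const TARGETS: Target[] = [
  { key: "flashAttention", label: "Flash Attention speedup", min: 2.49, max: 7.47, unit: "x" },
  { key: "search", label: "HNSW search improvement", min: 150, max: 12500, unit: "x" },
  { key: "memory", label: "Memory reduction", min: 50, max: 75, unit: "%" },
];

// A value below the band misses the target; a value above the band exceeds it.
function classify(measured: number, target: Target): Verdict {
  if (measured < target.min) return "below-target";
  if (measured > target.max) return "above-band";
  return "in-band";
}

interface ReportRow {
  label: string;
  measured: number;
  unit: "x" | "%";
  verdict: Verdict;
}

// Builds one report row per target from a record of measured values.
function validateAgainstTargets(measured: Record<TargetKey, number>): ReportRow[] {
  return TARGETS.map((t) => ({
    label: t.label,
    measured: measured[t.key],
    unit: t.unit,
    verdict: classify(measured[t.key], t),
  }));
}
```

Feeding it your own measurements, for example `validateAgainstTargets({ flashAttention: 3.1, search: 120, memory: 80 })`, yields one row per target that can go straight into the baseline report. Results above the band are flagged separately because an unexpectedly large ratio often points to a measurement problem rather than a win.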
Journey fit
Spans multiple journey phases - primary shelf plus alternate fits below.
Performance target validation and regression benchmarking belong on Ship under performance because you prove latency and memory budgets before scaling agent workloads. Perf is the canonical shelf for speedup matrices and memory reduction goals; agent-tooling build steps feed the same benchmarks but do not replace ship-time proof.
Where it fits
After wiring AgentDB HNSW, run search optimization tasks to confirm index builds meet the 150x–12,500x improvement band.
Gate a release on Flash Attention benchmarks showing movement toward the 2.49x–7.47x target versus your v2 baseline.
Re-run memory optimization tasks after scaling worker count to verify 50–75% memory reduction holds under load.
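A release gate of the kind described above could be sketched as follows. It assumes you can wrap the baseline and the optimized code path in functions of your own; `median`, `medianTimeMs`, and `gateOnSpeedup` are hypothetical helpers, not part of claude-flow.

```typescript
// Hypothetical sketch: time a baseline and a candidate, then gate on speedup.
function median(values: number[]): number {
  if (values.length === 0) throw new Error("median of empty list");
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]!
    : (sorted[mid - 1]! + sorted[mid]!) / 2;
}

// Runs `op` a few times after a warm-up and returns the median wall time in ms.
// The median is less sensitive to a single slow run than the mean.
async function medianTimeMs(
  op: () => Promise<void> | void,
  runs = 5,
  warmup = 1,
): Promise<number> {
  for (let i = 0; i < warmup; i++) await op();
  const samples: number[] = [];
  for (let i = 0; i < runs; i++) {
    const start = performance.now();
    await op();
    samples.push(performance.now() - start);
  }
  return median(samples);
}

interface GateResult {
  speedup: number; // baseline time divided by candidate time
  passed: boolean; // true when speedup reaches the required minimum
}

function gateOnSpeedup(baselineMs: number, candidateMs: number, minSpeedup: number): GateResult {
  if (baselineMs <= 0 || candidateMs <= 0) throw new Error("timings must be positive");
  const speedup = baselineMs / candidateMs;
  return { speedup, passed: speedup >= minSpeedup };
}
```

In a CI step you would measure both paths with `medianTimeMs` and pass the two medians to `gateOnSpeedup` with 2.49 as the minimum for the Flash Attention track, failing the build when `passed` is false.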
How it compares
Use for v3-specific performance matrices instead of generic “make it faster” code-review prompts with no benchmark harness.
Common Questions / FAQ
Who is v3 performance optimization for?
Solo developers and small teams operating claude-flow v3 who need structured validation of attention, search, and memory optimizations.
When should I use v3 performance optimization?
In Ship perf passes before production agent load, during Build agent-tooling when integrating Flash Attention or AgentDB, and in Operate when re-benchmarking after infra changes.
Is v3 performance optimization safe to install?
Check the Security Audits panel on this page and limit shell or network permissions while benchmark tasks touch local models, indexes, and telemetry.
SKILL.md
# V3 Performance Optimization

## What This Skill Does

Validates and optimizes claude-flow v3 to achieve industry-leading performance through Flash Attention, AgentDB HNSW indexing, and comprehensive system optimization with continuous benchmarking.

## Quick Start

```bash
# Initialize performance optimization
Task("Performance baseline", "Establish v2 performance benchmarks", "v3-performance-engineer")

# Target validation (parallel)
Task("Flash Attention", "Validate 2.49x-7.47x speedup target", "v3-performance-engineer")
Task("Search optimization", "Validate 150x-12,500x search improvement", "v3-performance-engineer")
Task("Memory optimization", "Achieve 50-75% memory reduction", "v3-performance-engineer")
```

## Performance Target Matrix

### Flash Attention Revolution

```
┌─────────────────────────────────────────┐
│             FLASH ATTENTION             │
├─────────────────────────────────────────┤
│  Baseline: Standard attention           │
│  Target: 2.49x - 7.47x speedup          │
│  Memory: 50-75% reduction               │
│  Latency: Sub-millisecond processing    │
└─────────────────────────────────────────┘
```

### Search Performance Revolution

```
┌─────────────────────────────────────────┐
│           SEARCH OPTIMIZATION           │
├─────────────────────────────────────────┤
│  Current: O(n) linear search            │
│  Target: 150x - 12,500x improvement     │
│  Method: HNSW indexing                  │
│  Latency: <100ms for 1M+ entries        │
└─────────────────────────────────────────┘
```

## Comprehensive Benchmark Suite

### Startup Performance

```typescript
class StartupBenchmarks {
  async benchmarkColdStart(): Promise<BenchmarkResult> {
    const startTime = performance.now();

    await this.initializeCLI();
    await this.initializeMCPServer();
    await this.spawnTestAgent();

    const totalTime = performance.now() - startTime;

    return {
      total: totalTime,
      target: 500, // ms
      achieved: totalTime < 500
    };
  }
}
```

### Memory Operation Benchmarks

```typescript
class MemoryBenchmarks {
  async benchmarkVectorSearch(): Promise<SearchBenchmark> {
    const queries = this.generateTestQueries(10000);

    // Baseline: Current linear search
    const baselineTime = await this.timeOperation(() =>
      this.currentMemory.searchAll(queries)
    );

    // Target: HNSW search
    const hnswTime = await this.timeOperation(() =>
      this.agentDBMemory.hnswSearchAll(queries)
    );

    const improvement = baselineTime / hnswTime;

    return {
      baseline: baselineTime,
      hnsw: hnswTime,
      improvement,
      targetRange: [150, 12500],
      achieved: improvement >= 150
    };
  }

  async benchmarkMemoryUsage(): Promise<MemoryBenchmark> {
    const baseline = process.memoryUsage().heapUsed;

    await this.loadTestDataset();
    const withData = process.memoryUsage().heapUsed;

    await this.enableOptimization();
    const optimized = process.memoryUsage().heapUsed;

    const reduction = (withData - optimized) / withData;

    return {
      baseline,
      withData,
      optimized,
      reductionPercent: reduction * 100,
      targetReduction: [50, 75],
      achieved: reduction >= 0.5
    };
  }
}
```

### Swarm Coordination Benchmarks

```typescript
class SwarmBenchmarks {
  async benchmark15AgentCoordination(): Promise<SwarmBenchmark> {
    const agents = await this.spawn15Agents();

    // Coordination latency
    const coordinationTime = await this.timeOperation(() =>
      this.coordinateSwarmTask(agents)
    );

    // Task decomposition
    const decompositionTime = await this.timeOperation(() =>
      this.decomposeComplexTask()
    );

    // Consensus achievement
    const consensusTime = await this.timeOper