V3 Performance Optimization

Name: V3 Performance Optimization
Author: ruvnet

ruvnet/ruflo

Benchmark and tune claude-flow v3 so Flash Attention, AgentDB search, and memory usage hit documented speedup targets before you rely on it in production agents.

Overview

V3 Performance Optimization is an agent skill, most often used in the Ship phase (and also in Build for agent tooling and Operate for infrastructure), that benchmarks claude-flow v3 against its Flash Attention, HNSW search, and memory-reduction targets.

Install

npx skills add https://github.com/ruvnet/ruflo --skill v3-performance-optimization

What is this skill?

Validates Flash Attention against a 2.49x–7.47x speedup target with 50–75% memory reduction and sub-millisecond latency.
Benchmarks AgentDB HNSW search for a 150x–12,500x improvement over baseline search paths.
Runs parallel Task workflows for the baseline, Flash Attention, search optimization, and memory optimization tracks.
Provides a performance target matrix for continuous benchmarking across attention, search, and system optimization.
Oriented toward validating claude-flow v3 performance against stated targets, not one-off micro-optimizations.
2.49x–7.47x Flash Attention speedup target
150x–12,500x search improvement target
50–75% memory reduction target

Compatible agents: Claude Code, Codex, any compatible agent

Adoption & trust: 623 installs on skills.sh; 58.5k GitHub stars; 3/3 security scanners passed (skills.sh audits).

What problem does it solve?

You upgraded to claude-flow v3 but have no reproducible baselines to confirm attention speedups, search latency, or memory wins.

Who is it for?

Agent builders optimizing claude-flow v3 stacks who need parallel benchmark tasks and explicit numeric targets.

Skip if: Simple CRUD apps with no agent runtime, or teams unwilling to run comparative baselines before tuning.

When should I use this skill?

Validating or optimizing claude-flow v3 performance for Flash Attention, AgentDB search, and memory after integration changes.

What do I get? / Deliverables

You get documented benchmark runs that map measured results to the skill’s speedup and memory matrices so you can ship or roll back v3 changes confidently.

Performance baseline report
Target-matrix validation results for attention, search, and memory

Journey fit

Spans multiple journey phases - primary shelf plus alternate fits below.

Primary fit

Performance target validation and regression benchmarking belong on Ship under performance because you prove latency and memory budgets before scaling agent workloads. Perf is the canonical shelf for speedup matrices and memory reduction goals; agent-tooling build steps feed the same benchmarks but do not replace ship-time proof.

Also useful

Build: Agent skills & templates
Operate: Infrastructure & cost

Where it fits

Example uses

Build: Agent skills & templates

After wiring AgentDB HNSW, run search optimization tasks to confirm index builds meet the 150x–12,500x improvement band.

Ship: Performance

Gate a release on Flash Attention benchmarks showing movement toward the 2.49x–7.47x target versus your v2 baseline.

Operate: Infrastructure & cost

Re-run memory optimization tasks after scaling worker count to verify 50–75% memory reduction holds under load.

How it compares

Use for v3-specific performance matrices instead of generic “make it faster” code-review prompts with no benchmark harness.

Common Questions / FAQ

Who is v3 performance optimization for?

Solo developers and small teams operating claude-flow v3 who need structured validation of attention, search, and memory optimizations.

When should I use v3 performance optimization?

In Ship perf passes before production agent load, during Build agent-tooling when integrating Flash Attention or AgentDB, and in Operate when re-benchmarking after infra changes.

Is v3 performance optimization safe to install?

Check the Security Audits panel on this page and limit shell or network permissions while benchmark tasks touch local models, indexes, and telemetry.

SKILL.md

# V3 Performance Optimization

## What This Skill Does

Validates and optimizes claude-flow v3 to achieve industry-leading performance through Flash Attention, AgentDB HNSW indexing, and comprehensive system optimization with continuous benchmarking.

## Quick Start

```
# Task tool calls issued by the agent, not shell commands
# Initialize performance optimization
Task("Performance baseline", "Establish v2 performance benchmarks", "v3-performance-engineer")

# Target validation (parallel)
Task("Flash Attention", "Validate 2.49x-7.47x speedup target", "v3-performance-engineer")
Task("Search optimization", "Validate 150x-12,500x search improvement", "v3-performance-engineer")
Task("Memory optimization", "Achieve 50-75% memory reduction", "v3-performance-engineer")
```

## Performance Target Matrix

### Flash Attention Revolution
```
┌─────────────────────────────────────────┐
│           FLASH ATTENTION               │
├─────────────────────────────────────────┤
│  Baseline: Standard attention           │
│  Target:   2.49x - 7.47x speedup        │
│  Memory:   50-75% reduction             │
│  Latency:  Sub-millisecond processing   │
└─────────────────────────────────────────┘
```

### Search Performance Revolution
```
┌─────────────────────────────────────────┐
│           SEARCH OPTIMIZATION           │
├─────────────────────────────────────────┤
│  Current:  O(n) linear search           │
│  Target:   150x - 12,500x improvement   │
│  Method:   HNSW indexing                │
│  Latency:  <100ms for 1M+ entries       │
└─────────────────────────────────────────┘
```
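
The two matrices above can be encoded as data so that a benchmark run is checked against them mechanically. The following is a minimal sketch, not part of the skill itself: the names `TARGETS`, `classify`, and `speedup` are hypothetical, and the timings in the example are made up for illustration.

```typescript
// Target bands copied from the matrices above. All identifiers here are
// illustrative assumptions, not names defined by the skill.
type Band = readonly [number, number];
type Verdict = "below" | "within" | "above";

const TARGETS = {
  flashAttentionSpeedup: [2.49, 7.47],
  searchImprovement: [150, 12500],
  memoryReductionPercent: [50, 75],
} as const;

// Place a measured value relative to a [min, max] target band.
function classify(measured: number, [min, max]: Band): Verdict {
  if (measured < min) return "below";
  if (measured > max) return "above";
  return "within";
}

// Speedup factor: how many times faster the optimized path ran.
function speedup(baselineMs: number, optimizedMs: number): number {
  if (optimizedMs <= 0) throw new RangeError("optimizedMs must be positive");
  return baselineMs / optimizedMs;
}

// Made-up timings: 1200 ms for the linear baseline, 4 ms with the index.
const ratio = speedup(1200, 4); // 300
console.log(classify(ratio, TARGETS.searchImprovement)); // "within"
```

Reporting "above" separately from "within" is a deliberate choice in this sketch: a result far beyond the documented band is more likely a measurement problem (for example, a cached result) than a real win, so it is worth a second look.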

## Comprehensive Benchmark Suite

### Startup Performance
```typescript
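import { performance } from "node:perf_hooks";

// Result shape inferred from the fields returned below.
interface BenchmarkResult {
  total: number;     // measured cold-start time (ms)
  target: number;    // cold-start budget (ms)
  achieved: boolean; // true when total is under target
}

abstract class StartupBenchmarks {
  // Startup steps come from the concrete harness; these signatures are assumed.
  protected abstract initializeCLI(): Promise<void>;
  protected abstract initializeMCPServer(): Promise<void>;
  protected abstract spawnTestAgent(): Promise<void>;

  async benchmarkColdStart(): Promise<BenchmarkResult> {
    const startTime = performance.now();

    await this.initializeCLI();
    await this.initializeMCPServer();
    await this.spawnTestAgent();

    const totalTime = performance.now() - startTime;

    return {
      total: totalTime,
      target: 500, // ms
      achieved: totalTime < 500
    };
  }
}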
```
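
A single cold-start sample is noisy, so it can help to collect several runs and compare a summary statistic against the 500 ms target. The sketch below assumes the samples were gathered elsewhere, ideally one fresh process per run so that each is a true cold start. `summarizeColdStarts` and `percentile` are hypothetical helpers, and the sample values are invented.

```typescript
interface ColdStartSummary {
  median: number;    // ms
  p95: number;       // ms
  achieved: boolean; // median under the target
}

// Nearest-rank percentile on an ascending, non-empty array.
function percentile(sortedAsc: number[], p: number): number {
  const rank = Math.ceil((p / 100) * sortedAsc.length);
  return sortedAsc[Math.max(0, rank - 1)];
}

// Summarize cold-start samples (ms) against a target budget (ms).
function summarizeColdStarts(samplesMs: number[], targetMs = 500): ColdStartSummary {
  if (samplesMs.length === 0) throw new RangeError("no samples");
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const median = percentile(sorted, 50);
  return { median, p95: percentile(sorted, 95), achieved: median < targetMs };
}

// Invented samples from five runs.
console.log(summarizeColdStarts([480, 410, 620, 450, 430]));
// { median: 450, p95: 620, achieved: true }
```

Here the median passes while the slowest run exceeds the budget, which is the kind of spread a single sample would hide.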

### Memory Operation Benchmarks
```typescript
class MemoryBenchmarks {
  async benchmarkVectorSearch(): Promise<SearchBenchmark> {
    const queries = this.generateTestQueries(10000);

    // Baseline: Current linear search
    const baselineTime = await this.timeOperation(() =>
      this.currentMemory.searchAll(queries)
    );

    // Target: HNSW search
    const hnswTime = await this.timeOperation(() =>
      this.agentDBMemory.hnswSearchAll(queries)
    );

    const improvement = baselineTime / hnswTime;

    return {
      baseline: baselineTime,
      hnsw: hnswTime,
      improvement,
      targetRange: [150, 12500],
      achieved: improvement >= 150
    };
  }

  async benchmarkMemoryUsage(): Promise<MemoryBenchmark> {
    // heapUsed is noisy between collections; for steadier samples, run Node
    // with --expose-gc and call global.gc() before each reading.
    const baseline = process.memoryUsage().heapUsed;

    await this.loadTestDataset();
    const withData = process.memoryUsage().heapUsed;

    await this.enableOptimization();
    const optimized = process.memoryUsage().heapUsed;

    // Fraction of the loaded heap freed by the optimization (0.5 = 50%).
    const reduction = (withData - optimized) / withData;

    return {
      baseline,
      withData,
      optimized,
      reductionPercent: reduction * 100,
      targetReduction: [50, 75],
      achieved: reduction >= 0.5
    };
  }
}
```
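
The benchmark classes call `this.timeOperation`, which this excerpt does not define. One plausible shape, written here as a standalone function, is sketched below. It is an assumption about the helper's behavior (wall-clock milliseconds for one invocation), not the skill's actual implementation.

```typescript
import { performance } from "node:perf_hooks";

// Hypothetical stand-in for the undefined `timeOperation` helper:
// runs `op` once (sync or async) and returns elapsed wall-clock time in ms.
async function timeOperation(op: () => unknown | Promise<unknown>): Promise<number> {
  const start = performance.now();
  await op();
  return performance.now() - start;
}

// Usage: time a deliberately slow async operation.
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

timeOperation(() => sleep(50)).then((elapsedMs) => {
  console.log(`elapsed: ${elapsedMs.toFixed(1)} ms`);
});
```

Because the speedup is computed as `baselineTime / hnswTime`, both sides should be timed with the same helper and the same query set so that fixed overhead cancels as far as possible.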

### Swarm Coordination Benchmarks
```typescript
class SwarmBenchmarks {
  async benchmark15AgentCoordination(): Promise<SwarmBenchmark> {
    const agents = await this.spawn15Agents();

    // Coordination latency
    const coordinationTime = await this.timeOperation(() =>
      this.coordinateSwarmTask(agents)
    );

    // Task decomposition
    const decompositionTime = await this.timeOperation(() =>
      this.decomposeComplexTask()
    );

    // Consensus achievement
    const consensusTime = await this.timeOper…
```
