Caveman Token Optimizer

Name: Caveman Token Optimizer
Author: aradotso

aradotso/trending-skills

792 installs
66 repo stars
Updated July 9, 2026
aradotso/trending-skills

caveman-token-optimizer is a Claude Code skill that makes AI agents reply in compressed caveman-speak, cutting roughly 65% of output tokens on average while keeping technical answers accurate.

About

caveman-token-optimizer is an agent-behavior skill from aradotso/trending-skills that rewrites Claude Code and Codex replies into terse caveman-speak, reporting about 65% average output token reduction and up to 87% in favorable cases while preserving technical precision. Install triggers include phrases like "install caveman mode" and "reduce token usage caveman." Developers reach for caveman-token-optimizer when long agent transcripts inflate API bills or slow loops, but they still need correct commands, code, and debugging guidance in responses.

Cuts 65-75% of output tokens on average (up to 87%)
Removes all pleasantries, hedging, articles and verbose transitions
Preserves full code blocks, technical terms, error messages and git/PR text
One-command install via skills.sh or Claude plugin marketplace
Works in any coding, debugging or planning session

Caveman Token Optimizer by the numbers

792 all-time installs (skills.sh)
+16 installs in the week ending Jul 25, 2026 (Skillselion tracking)
Ranked #1,314 of 16,659 AI & Agent Building skills by installs in the Skillselion catalog
Security screen: MEDIUM risk (skills.sh audit)
Data as of Jul 25, 2026 (Skillselion catalog sync)

npx skills add https://github.com/aradotso/trending-skills --skill caveman-token-optimizer

Add your badge

Show developers this skill is listed on Skillselion. Paste this into your README.

[![Listed on Skillselion](https://skillselion.com/badge/skills/aradotso/trending-skills/caveman-token-optimizer.svg)](https://skillselion.com/skills/aradotso/trending-skills/caveman-token-optimizer)

Installs	792
repo stars	★ 66
Security audit	2 / 3 scanners passed
Last updated	July 9, 2026
Repository	aradotso/trending-skills ↗

How do you reduce Claude Code output token usage?

Slash token usage and speed up every Claude Code or Codex conversation without losing technical precision.

Who is it for?

Developers running high-volume Claude Code or Codex sessions who need shorter answers without sacrificing technical correctness.

Skip if: Stakeholder-facing prose, documentation drafts, or teams requiring formal polished language in every agent response.

When should I use this skill?

A developer asks to cut token usage, enable caveman mode, or speed up verbose Claude Code conversations.

What you get

Caveman-compressed agent replies with lower token counts per turn

caveman-mode agent response style
reduced per-turn output token volume

By the numbers

Reports ~65% average output token reduction
Reports up to 87% token reduction in favorable cases

Files

SKILL.mdMarkdownGitHub ↗

Caveman Token Optimizer

Skill by ara.so — Daily 2026 Skills collection.

A Claude Code skill and Codex plugin that makes AI agents respond in compressed caveman-speak — cutting ~65% of output tokens on average (up to 87%) while keeping full technical accuracy. No pleasantries. No filler. Just answer.

What It Does

Caveman mode strips:

Pleasantries: "Sure, I'd be happy to help!" → gone
Hedging: "It might be worth considering" → gone
Articles (a, an, the) → gone
Verbose transitions → gone

Caveman keeps:

All code blocks (written normally)
Technical terms (exact: useMemo, polymorphism, middleware)
Error messages (quoted exactly)
Git commits and PR descriptions (normal)

Same fix. 75% less word. Brain still big.

Install

Claude Code (npx)

npx skills add JuliusBrussee/caveman

Claude Code (plugin system)

claude plugin marketplace add JuliusBrussee/caveman
claude plugin install caveman@caveman

Codex

1. Clone the repo 2. Open Codex inside the repo 3. Run /plugins 4. Search Caveman 5. Install plugin

Install once. Works in all sessions after that.

Manual / Local

git clone https://github.com/JuliusBrussee/caveman.git
cd caveman
pip install -e .

Usage — Trigger Commands

Claude Code

/caveman          # enable default (full) caveman mode
/caveman lite     # professional brevity, grammar intact
/caveman full     # default — drop articles, use fragments
/caveman ultra    # maximum compression, telegraphic

Codex

$caveman
$caveman lite
$caveman full
$caveman ultra

Natural language triggers

Any of these phrases activate caveman mode:

"talk like caveman"
"caveman mode"
"less tokens please"
"be concise"

Disable

/caveman off
# or say: "stop caveman" / "normal mode"

Level sticks until changed or session ends.

Intensity Levels

Level	Trigger	Style	Example
Lite	`/caveman lite`	Drop filler, keep grammar	"Component re-renders because inline object prop creates new reference each cycle. Wrap in `useMemo`."
Full	`/caveman full`	Drop articles, use fragments	"New object ref each render. Inline prop = new ref = re-render. Wrap in `useMemo`."
Ultra	`/caveman ultra`	Telegraphic, abbreviate everything	"Inline obj prop → new ref → re-render. `useMemo`."

Benchmark Results

Real token counts from Claude API (reproducible via benchmarks/ directory):

Task	Normal	Caveman	Saved
React re-render bug	1180	159	87%
Auth middleware fix	704	121	83%
PostgreSQL pool setup	2347	380	84%
Git rebase vs merge	702	292	58%
Async/await refactor	387	301	22%
Docker multi-stage build	1042	290	72%
Average	1214	294	65%

Important: Caveman only affects output tokens. Thinking/reasoning tokens are untouched. Caveman make mouth smaller, not brain.

Reproducing Benchmarks

git clone https://github.com/JuliusBrussee/caveman.git
cd caveman/benchmarks

# Set your Anthropic API key
export ANTHROPIC_API_KEY=your_key_here

# Run benchmark suite
python run_benchmarks.py

# Compare normal vs caveman responses
python compare.py --task react-rerender
python compare.py --task auth-middleware
python compare.py --all

Code Examples — What Caveman Mode Changes

Before (normal, 69 tokens)

The reason your React component is re-rendering is likely because
you're creating a new object reference on each render cycle. When
you pass an inline object as a prop, React's shallow comparison
sees it as a different object every time, which triggers a
re-render. I'd recommend using useMemo to memoize the object.

After (caveman full, 19 tokens)

New object ref each render. Inline object prop = new ref = re-render.
Wrap in `useMemo`.

Code blocks stay normal — caveman not stupid

# Caveman explains in grunt, but code stays clean:
# "Token expiry check broken. Fix:"

def verify_token(token: str) -> bool:
    payload = jwt.decode(token, SECRET_KEY, algorithms=["HS256"])
    # Was: payload["exp"] < time.time()
    # Fix:
    return payload["exp"] >= time.time()

What Caveman Preserves vs. Removes

# Tokens caveman REMOVES (waste):
filler_phrases = [
    "I'd be happy to help you with that",   # 8 tokens gone
    "The reason this is happening is because", # 7 tokens gone
    "I would recommend that you consider",   # 7 tokens gone
    "Sure, let me take a look at that",      # 8 tokens gone
    "Great question!",                        # 2 tokens gone
    "Certainly!",                             # 1 token gone
]

# Things caveman KEEPS (substance):
preserved = [
    "code blocks",          # always normal
    "technical_terms",      # exact spelling preserved
    "error_messages",       # quoted verbatim
    "variable_names",       # exact
    "git_commits",          # normal prose
    "pr_descriptions",      # normal prose
]

Integration Pattern — Using in a Project

If you want caveman-style compression in your own Claude API calls:

import anthropic

client = anthropic.Anthropic()  # uses ANTHROPIC_API_KEY env var

# Load the caveman SKILL.md as a system prompt addition
with open("path/to/caveman/SKILL.md", "r") as f:
    caveman_skill = f.read()

response = client.messages.create(
    model="claude-opus-4-5",
    max_tokens=1024,
    system=f"{caveman_skill}\n\nRespond in caveman mode: full intensity.",
    messages=[
        {"role": "user", "content": "Why is my React component re-rendering?"}
    ]
)

print(response.content[0].text)
# → "New object ref each render. Inline prop = new ref = re-render. useMemo fix."
print(f"Tokens used: {response.usage.output_tokens}")  # ~19 vs ~69

Session Workflow

# Start session with caveman
/caveman full

# Ask technical questions normally — agent responds in caveman
> Why does my Docker build take so long?
→ "Layer cache miss. COPY before RUN npm install. Fix order:"
[code block shown normally]

# Switch intensity mid-session
/caveman lite

# Turn off for PR description writing
/caveman off
> Write a PR description for this auth fix
→ [normal, professional prose]

# Back to caveman
/caveman

Troubleshooting

Caveman mode not activating:

# Verify plugin installed
claude plugin list | grep caveman

# Reinstall
claude plugin remove caveman
claude plugin install caveman@caveman

Savings lower than expected:

Caveman only compresses output tokens — input tokens unchanged
Tasks with heavy code output (like Docker setup) see less savings since code is preserved verbatim
Reasoning/thinking tokens not affected — savings show in visible response only
Ultra mode gets maximum compression; switch if full mode feels verbose

Need normal mode for specific output:

/caveman off   # for PR descriptions, user-facing docs, formal reports
/caveman       # re-enable after

Benchmarking your own tasks:

cd benchmarks/
export ANTHROPIC_API_KEY=your_key_here
python run_benchmarks.py --task "your custom task description"

Why It Works

Backed by a March 2026 paper "Brevity Constraints Reverse Performance Hierarchies in Language Models": constraining large models to brief responses improved accuracy by 26 percentage points on certain benchmarks. Verbose not always better.

TOKENS SAVED          ████████ 65% avg (up to 87%)
TECHNICAL ACCURACY    ████████ 100%
RESPONSE SPEED        ████████ faster (less to generate)
READABILITY           ████████ better (no wall of text)

Key Files

caveman/
├── SKILL.md          # the skill definition loaded by Claude Code
├── benchmarks/
│   ├── run_benchmarks.py   # reproduce token count results
│   └── compare.py          # side-by-side comparison tool
├── plugin.json             # Codex plugin manifest
└── README.md

Related skills

Setup Matt Pocock SkillsScaffold the per-repo configuration that Matt Pocock’s engineering agent skills rely on so they understand the issue tracker, triage labels, and domain documentation la462k185k

Lark Skill MakerQuickly turn any Lark/Feishu OpenAPI call or multi-step workflow into a reusable agent skill with its own SKILL.md.379k15.8k

CavemanSlash token usage by roughly 75% while keeping every technical detail intact when working with Claude Code, Cursor or similar agents.378k92.5k

Lark AppsConnect Claude, Cursor or custom agents directly to Lark (Feishu) for messaging, document automation, approval workflows and enterprise data access.375k

Running Claude Code Via Litellm CopilotRun Claude Code at a fraction of the cost by routing requests through LiteLLM to the GitHub Copilot Chat API.270k72

Codex PetGenerate a complete Codex Pet spritesheet and metadata from one reference image without needing an OpenAI key or Codex Pro.246k8

How it compares

Choose caveman-token-optimizer for conversational token shaving rather than prompt-engineering or context-pruning skills that restructure inputs.

FAQ

How much does caveman-token-optimizer reduce tokens?

caveman-token-optimizer reports roughly 65% average output token reduction on Claude Code and Codex replies, with up to 87% savings in favorable cases while keeping technical content accurate.

Does caveman mode change technical correctness?

caveman-token-optimizer compresses phrasing into caveman-speak but states that full technical accuracy is preserved, including commands, code, and debugging guidance.

Is Caveman Token Optimizer safe to install?

skills.sh reports 2 of 3 security scanners passed. Review the Security Audits panel on this page before installing in production.

AI & Agent Buildingagentsautomation

About

Caveman Token Optimizer by the numbers

Add your badge

How do you reduce Claude Code output token usage?

Who is it for?

When should I use this skill?

What you get

By the numbers

Files

Caveman Token Optimizer

What It Does

Install

Claude Code (npx)

Claude Code (plugin system)

Codex

Manual / Local

Usage — Trigger Commands

Claude Code

Codex

Natural language triggers

Disable

Intensity Levels

Benchmark Results

Reproducing Benchmarks

Code Examples — What Caveman Mode Changes

Before (normal, 69 tokens)

After (caveman full, 19 tokens)

Code blocks stay normal — caveman not stupid

What Caveman Preserves vs. Removes

Integration Pattern — Using in a Project

Session Workflow

Troubleshooting

Why It Works

Key Files

Links

Related skills

How it compares

FAQ

How much does caveman-token-optimizer reduce tokens?

Does caveman mode change technical correctness?

Is Caveman Token Optimizer safe to install?

This week in AI coding