Token Budget Advisor

Name: Token Budget Advisor
Author: affaan-m

affaan-m/everything-claude-code

Let the user pick response depth and token spend before the agent answers long or expensive threads.

Overview

Token Budget Advisor is a journey-wide agent skill that offers depth and token-budget choices before answering whenever the user explicitly wants to control response length or detail.

Install

npx skills add https://github.com/affaan-m/everything-claude-code --skill token-budget-advisor

What is this skill?

Triggers on explicit token-budget, depth, or short-vs-long answer requests—not auth/session “tokens”
Step 1 estimates input tokens before offering depth choices
Maintains a chosen depth silently for the rest of the session once the user sets it
Skips activation for trivial one-line answers or when depth was already specified
Bilingual trigger phrases (EN/ES) for budget and brevity control
Multi-step flow: estimate input tokens, then offer depth choice before the main answer

Compatible agents: Claude Code, Cursor, Codex, Windsurf

Adoption & trust: 3.4k installs on skills.sh; 210k GitHub stars; 2/3 security scanners passed (skills.sh audits).

What problem does it solve?

You want a long agent answer sometimes but cannot afford open-ended max-detail replies on every prompt in a paid coding session.

Who is it for?

Power users on metered Claude/Cursor/Codex plans who routinely ask for TL;DR, partial depth, or token-aware answers.

Skip if: Sessions where you already named a depth level for this chat, one-line factual lookups, or discussions about auth/payment/session tokens.

When should I use this skill?

User explicitly wants token budget, response length, answer depth, short/brief/detailed/exhaustive variants, or Spanish phrasing for controlling how much the agent uses—not auth/session payment tokens.

What do I get? / Deliverables

The agent pauses to estimate input size, lets you pick depth upfront, then answers at that level for the rest of the session until you change it.

Depth choice prompt before the substantive answer
Session-consistent depth until the user changes it

Recommended Skills

Grill Memattpocock/skills

Grill Me is an agent skill that interviews you relentlessly about a plan or design until you and the agent share the sam…278k installs·121k stars

Grill With Docsmattpocock/skills

Grill With Docs is an agent skill that runs a structured grilling session on your plan: it interviews you relentlessly, …218k installs·121k stars

Brainstormingobra/superpowers

Brainstorming is a journey-wide Superpowers agent skill that turns rough ideas into approved designs through guided conv…209k installs·221k stars

Lark Tasklarksuite/cli

Lark task v2 skill for todos, tasklists, related/my task queries, attachments, and task-agent lifecycle, with guidance o…209k installs·13.7k stars

Lark Workflow Standup Reportlarksuite/cli

Lark workflow skill that pulls agenda events and incomplete tasks, then expects AI to time-convert, detect conflicts, an…208k installs·13.7k stars

Cavemanjuliusbrussee/blueprint

Caveman is a blueprint agent skill that re-encodes SPEC.md and spec-referencing writes into a terse grammar with symboli…197k installs·1k stars

Journey fit

Useful at every journey phase - explore requirements and options before committing to a direction.

Where it fits

Example use

BuildBackend, data & payments

Ask for a deep architecture review but cap follow-up explanations at brief depth to save tokens.

Example use

ShipCode review

Request an exhaustive security pass once, then maintain a shorter depth for nit fixes.

Example use

GrowContent & marketing

Draft marketing copy with an explicit ‘short version’ before expanding only the sections you approve.

Example use

OperateError tracking

Triage production logs with TL;DR summaries first, then escalate to detailed traces on one incident.

How it compares

Session meta-control for answer depth—not a linter, test runner, or codebase integration skill.

Common Questions / FAQ

Who is token-budget-advisor for?

Solo builders and agent power users who explicitly want to trade detail for token savings during long coding and planning sessions.

When should I use token-budget-advisor?

Whenever you mention token budget, response length, short vs detailed answers, or depth control—during Build implementation, Ship reviews, Grow content drafts, or Operate incident triage alike.

Is token-budget-advisor safe to install?

It is behavioral guidance only; confirm package integrity and audit results on this page’s Security Audits panel before enabling skills in production agents.

SKILL.md

READMESKILL.md - Token Budget Advisor

# Token Budget Advisor (TBA)

Intercept the response flow to offer the user a choice about response depth **before** Claude answers.

## When to Use

- User wants to control how long or detailed a response is
- User mentions tokens, budget, depth, or response length
- User says "short version", "tldr", "brief", "al 25%", "exhaustive", etc.
- Any time the user wants to choose depth/detail level upfront

**Do not trigger** when: user already set a level this session (maintain it silently), or the answer is trivially one line.

## How It Works

### Step 1 — Estimate input tokens

Use the repository's canonical context-budget heuristics to estimate the prompt's token count mentally.

Use the same calibration guidance as [context-budget](../context-budget/SKILL.md):

- prose: `words × 1.3`
- code-heavy or mixed/code blocks: `chars / 4`

For mixed content, use the dominant content type and keep the estimate heuristic.

### Step 2 — Estimate response size by complexity

Classify the prompt, then apply the multiplier range to get the full response window:

| Complexity   | Multiplier range | Example prompts                                      |
|--------------|------------------|------------------------------------------------------|
| Simple       | 3× – 8×          | "What is X?", yes/no, single fact                   |
| Medium       | 8× – 20×         | "How does X work?"                                  |
| Medium-High  | 10× – 25×        | Code request with context                           |
| Complex      | 15× – 40×        | Multi-part analysis, comparisons, architecture      |
| Creative     | 10× – 30×        | Stories, essays, narrative writing                  |

Response window = `input_tokens × mult_min` to `input_tokens × mult_max` (but don’t exceed your model’s configured output-token limit).

### Step 3 — Present depth options

Present this block **before** answering, using the actual estimated numbers:

```
Analyzing your prompt...

Input: ~[N] tokens  |  Type: [type]  |  Complexity: [level]  |  Language: [lang]

Choose your depth level:

[1] Essential   (25%)  ->  ~[tokens]   Direct answer only, no preamble
[2] Moderate    (50%)  ->  ~[tokens]   Answer + context + 1 example
[3] Detailed    (75%)  ->  ~[tokens]   Full answer with alternatives
[4] Exhaustive (100%)  ->  ~[tokens]   Everything, no limits

Which level? (1-4 or say "25% depth", "50% depth", "75% depth", "100% depth")

Precision: heuristic estimate ~85-90% accuracy (±15%).
```

Level token estimates (within the response window):
- 25%  → `min + (max - min) × 0.25`
- 50%  → `min + (max - min) × 0.50`
- 75%  → `min + (max - min) × 0.75`
- 100% → `max`

### Step 4 — Respond at the chosen level

| Level            | Target length       | Include                                             | Omit                                              |
|------------------|---------------------|-----------------------------------------------------|---------------------------------------------------|
| 25% Essential    | 2-4 sentences max   | Direct answer, key conclusion

What is this skill?

Triggers on explicit token-budget, depth, or short-vs-long answer requests—not auth/session “tokens”

Step 1 estimates input tokens before offering depth choices

Maintains a chosen depth silently for the rest of the session once the user sets it

Skips activation for trivial one-line answers or when depth was already specified

Bilingual trigger phrases (EN/ES) for budget and brevity control

Multi-step flow: estimate input tokens, then offer depth choice before the main answer

Compatible agents: Claude Code, Cursor, Codex, Windsurf

Adoption & trust: 3.4k installs on skills.sh; 210k GitHub stars; 2/3 security scanners passed (skills.sh audits).

Journey fit

Useful at every journey phase - explore requirements and options before committing to a direction.

Where it fits

Example use

BuildBackend, data & payments

Ask for a deep architecture review but cap follow-up explanations at brief depth to save tokens.

Example use

ShipCode review

Request an exhaustive security pass once, then maintain a shorter depth for nit fixes.

Example use

GrowContent & marketing

Draft marketing copy with an explicit ‘short version’ before expanding only the sections you approve.

Example use

OperateError tracking

Triage production logs with TL;DR summaries first, then escalate to detailed traces on one incident.

SKILL.md

READMESKILL.md - Token Budget Advisor

# Token Budget Advisor (TBA)

Intercept the response flow to offer the user a choice about response depth **before** Claude answers.

## When to Use

- User wants to control how long or detailed a response is
- User mentions tokens, budget, depth, or response length
- User says "short version", "tldr", "brief", "al 25%", "exhaustive", etc.
- Any time the user wants to choose depth/detail level upfront

**Do not trigger** when: user already set a level this session (maintain it silently), or the answer is trivially one line.

## How It Works

### Step 1 — Estimate input tokens

Use the repository's canonical context-budget heuristics to estimate the prompt's token count mentally.

Use the same calibration guidance as [context-budget](../context-budget/SKILL.md):

- prose: `words × 1.3`
- code-heavy or mixed/code blocks: `chars / 4`

For mixed content, use the dominant content type and keep the estimate heuristic.

### Step 2 — Estimate response size by complexity

Classify the prompt, then apply the multiplier range to get the full response window:

| Complexity   | Multiplier range | Example prompts                                      |
|--------------|------------------|------------------------------------------------------|
| Simple       | 3× – 8×          | "What is X?", yes/no, single fact                   |
| Medium       | 8× – 20×         | "How does X work?"                                  |
| Medium-High  | 10× – 25×        | Code request with context                           |
| Complex      | 15× – 40×        | Multi-part analysis, comparisons, architecture      |
| Creative     | 10× – 30×        | Stories, essays, narrative writing                  |

Response window = `input_tokens × mult_min` to `input_tokens × mult_max` (but don’t exceed your model’s configured output-token limit).

### Step 3 — Present depth options

Present this block **before** answering, using the actual estimated numbers:

```
Analyzing your prompt...

Input: ~[N] tokens  |  Type: [type]  |  Complexity: [level]  |  Language: [lang]

Choose your depth level:

[1] Essential   (25%)  ->  ~[tokens]   Direct answer only, no preamble
[2] Moderate    (50%)  ->  ~[tokens]   Answer + context + 1 example
[3] Detailed    (75%)  ->  ~[tokens]   Full answer with alternatives
[4] Exhaustive (100%)  ->  ~[tokens]   Everything, no limits

Which level? (1-4 or say "25% depth", "50% depth", "75% depth", "100% depth")

Precision: heuristic estimate ~85-90% accuracy (±15%).
```

Level token estimates (within the response window):
- 25%  → `min + (max - min) × 0.25`
- 50%  → `min + (max - min) × 0.50`
- 75%  → `min + (max - min) × 0.75`
- 100% → `max`

### Step 4 — Respond at the chosen level

| Level            | Target length       | Include                                             | Omit                                              |
|------------------|---------------------|-----------------------------------------------------|---------------------------------------------------|
| 25% Essential    | 2-4 sentences max   | Direct answer, key conclusion

Overview

Install

What is this skill?

What problem does it solve?

Who is it for?

When should I use this skill?

What do I get? / Deliverables

Recommended Skills

Journey fit

Where it fits

Who is token-budget-advisor for?

When should I use token-budget-advisor?

Is token-budget-advisor safe to install?

SKILL.md

This week for builders

Overview

Install

What is this skill?

What problem does it solve?

Who is it for?

When should I use this skill?

What do I get? / Deliverables

Recommended Skills

Journey fit

Where it fits

Who is token-budget-advisor for?

When should I use token-budget-advisor?

Is token-budget-advisor safe to install?

SKILL.md