Quota Management

Name: Quota Management
Author: athola

athola/claude-night-market

Estimate tokens and API cost from files and task types before kicking off large agent runs so you stay inside quota and budget.

Install

npx skills add https://github.com/athola/claude-night-market --skill quota-management

What is this skill?

File-based token estimation with suffix-specific character-to-token ratios (code, JSON, text)
Task-based ranges for analysis, summarization, pattern extraction, and boilerplate generation
USD cost helper using per-model input and output rates (e.g. gemini-pro, gemini-flash, qwen-max)
estimate_file_tokens(Path) pattern sized for pre-flight checks in scripts
Documented estimated_tokens: 450 in skill metadata for lightweight invocation

Adoption & trust: 1 install on skills.sh; 304 GitHub stars; 3/3 security scanners passed (skills.sh audits); trending (+100% hot-view momentum).

Journey fit

Primary fit

Build: Agent skills & templates

Quota and estimation discipline matters most while wiring agent workflows and batch automation, where unbounded context burns budget silently. File-based token ratios and per-task token tables are agent-tooling primitives, not shipping or marketing deliverables.

Common Questions / FAQ

Is Quota Management safe to install?

skills.sh reports 3 of 3 security scanners passed. Review the Security Audits panel on this page before installing in production.

SKILL.md

# Estimation Patterns

## Token Estimation

### File-Based Estimation

```python
from pathlib import Path

# Approximate characters per token by file type
TOKEN_RATIOS = {
    "code": 3.2,      # .py, .js, .ts, .go, .rs
    "json": 3.6,      # .json, .yaml, .yml, .toml
    "text": 4.2,      # .md, .txt, .rst
    "default": 4.0    # any other suffix
}

def estimate_file_tokens(path: Path) -> int:
    """Estimate tokens for a file from its size on disk."""
    # st_size is in bytes, which is close to the character count for mostly-ASCII files
    size = path.stat().st_size
    suffix = path.suffix.lower()

    if suffix in [".py", ".js", ".ts", ".go", ".rs"]:
        ratio = TOKEN_RATIOS["code"]
    elif suffix in [".json", ".yaml", ".yml", ".toml"]:
        ratio = TOKEN_RATIOS["json"]
    elif suffix in [".md", ".txt", ".rst"]:
        ratio = TOKEN_RATIOS["text"]
    else:
        ratio = TOKEN_RATIOS["default"]

    return int(size / ratio)
```

### Task-Based Estimation

| Task Type | Input Tokens | Output Tokens |
|-----------|--------------|---------------|
| File analysis | 15-50/file | 200-500 |
| Code summarization | 1-3% of source | 300-800 |
| Pattern extraction | 5-20/match | 100-300 |
| Boilerplate generation | 50-200/template | Varies |
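
The output ranges in the table above can be turned into a simple lookup. The sketch below is one possible shape for an `estimate_output_tokens` helper; the task-type keys, the `fallback` parameter, and the choice to return the upper bound of each range are assumptions made for illustration, not part of the skill. Because the table lists boilerplate output as "Varies", that case has no number and uses the caller's fallback.

```python
from typing import Optional

# Output-token ranges copied from the task table. The dictionary keys are
# hypothetical identifiers chosen for this sketch.
OUTPUT_TOKEN_RANGES: dict[str, Optional[tuple[int, int]]] = {
    "file_analysis": (200, 500),
    "code_summarization": (300, 800),
    "pattern_extraction": (100, 300),
    "boilerplate_generation": None,  # the table says "Varies"
}


def estimate_output_tokens(task_type: str, fallback: int = 1000) -> int:
    """Return a conservative (upper-bound) output estimate for a task type."""
    bounds = OUTPUT_TOKEN_RANGES.get(task_type)
    if bounds is None:
        # Unknown task, or a task with no documented range: use the caller's guess.
        return fallback
    return bounds[1]
```

Returning the upper bound keeps the estimate pessimistic, which suits a budget check; a midpoint would suit average-case planning better.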

## Cost Estimation

### Cost Calculation

```python
def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    model: str
) -> float:
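    """Estimate cost in USD."""
    # Rates are USD per 1M tokens; pricing changes, so check current provider rates.
    rates = {
        "gemini-pro": {"input": 0.50, "output": 1.50},
        "gemini-flash": {"input": 0.075, "output": 0.30},
        "qwen-max": {"input": 0.40, "output": 1.20},
    }

    # Unknown models fall back to the gemini-pro rates.
    rate = rates.get(model, rates["gemini-pro"])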
    input_cost = (input_tokens / 1_000_000) * rate["input"]
    output_cost = (output_tokens / 1_000_000) * rate["output"]

    return input_cost + output_cost
```

### Cost Thresholds

| Category | Cost Range | Example Operations |
|----------|------------|-------------------|
| Low | <$0.01 | Pattern counting, import extraction |
| Medium | $0.01-$0.10 | Module summarization, code analysis |
| High | >$0.10 | Full codebase review, documentation |
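
A small classifier makes the categories above usable in scripts. This is a sketch: the function name `classify_cost` and the lowercase labels are assumptions, and the boundaries follow the table (exactly $0.01 and exactly $0.10 both count as Medium).

```python
def classify_cost(cost_usd: float) -> str:
    """Map an estimated USD cost to the Low / Medium / High categories."""
    if cost_usd < 0.01:
        return "low"
    if cost_usd <= 0.10:
        return "medium"
    return "high"
```

As a worked example using the gemini-flash rates listed in this section, 2,000,000 input tokens and 100,000 output tokens cost roughly 2.0 × $0.075 + 0.1 × $0.30 = $0.18, which falls in the High category.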

## Pre-Flight Checks

### Estimation Workflow

```python
def preflight_check(
    files: list[Path], prompt: str, task_type: str, model: str
) -> dict:
    """Estimate resources before operation."""
    input_tokens = sum(estimate_file_tokens(f) for f in files)
    input_tokens += len(prompt) // 4  # Prompt tokens (~4 characters per token)

    # estimate_output_tokens and can_handle_task are helpers not defined in this section.
    output_tokens = estimate_output_tokens(task_type)
    cost = estimate_cost(input_tokens, output_tokens, model)

    return {
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "estimated_cost": cost,
        "within_quota": can_handle_task(input_tokens)
    }
```
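
The workflow above calls `can_handle_task`, which this section does not define. One possible shape is sketched below. It assumes a hypothetical `QuotaState` snapshot and a `headroom` safety margin, and it passes the quota state explicitly to stay self-contained, whereas the call above takes only `input_tokens`.

```python
from dataclasses import dataclass


@dataclass
class QuotaState:
    """Hypothetical snapshot of a token quota for the current window."""
    limit_tokens: int
    used_tokens: int


def can_handle_task(
    input_tokens: int, quota: QuotaState, headroom: float = 0.05
) -> bool:
    """Return True if the task fits while keeping `headroom` of the limit free."""
    usable = quota.limit_tokens * (1 - headroom) - quota.used_tokens
    return input_tokens <= usable
```

Reserving a small share of the limit leaves room for estimation error, since file-size ratios are only approximate.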


---
name: threshold-strategies
description: Strategies for handling quota thresholds and graceful degradation
estimated_tokens: 400
---

# Threshold Strategies

## Degradation Patterns

### Progressive Degradation

```python
def get_degradation_strategy(usage_percent: float) -> str:
    if usage_percent < 80:
        return "full_operation"
    elif usage_percent < 90:
        return "reduce_batch_size"
    elif usage_percent < 95:
        return "essential_only"
    else:
        return "defer_or_secondary"
```

### Batch Size Adjustment

| Threshold | Batch Size | Rationale |
|-----------|------------|-----------|
| <80% | 100% | Full capacity available |
| 80-90% | 50% | Conserve for critical ops |
| 90-95% | 25% | Minimal batches only |
| ≥95% | 0% | Defer all batches |
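
The batch-size table can be applied directly in code. The sketch below uses the same cut-offs as `get_degradation_strategy`, so usage of exactly 95% is treated as deferred. The function name `adjusted_batch_size` and its parameters are assumptions made for illustration.

```python
def adjusted_batch_size(usage_percent: float, full_batch: int) -> int:
    """Scale a batch size by the multiplier for the current quota usage."""
    if usage_percent < 80:
        factor = 1.0    # full capacity available
    elif usage_percent < 90:
        factor = 0.5    # conserve for critical operations
    elif usage_percent < 95:
        factor = 0.25   # minimal batches only
    else:
        factor = 0.0    # defer all batches
    return int(full_batch * factor)
```

A result of 0 signals that the caller should defer the batch instead of submitting an empty one.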

## Recovery Strategies

### Wait for Reset
```python
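def wait_for_reset(quota_type: str) -> int:
    """Return seconds until the given quota type resets."""
    # seconds_until_midnight() is a helper not defined in this section.
    reset_times = {
        "rpm": 60,           # Requests per minute: worst-case wait
        "tpm": 60,           # Tokens per minute: worst-case wait
        "daily": seconds_until_midnight()
    }
    # Unknown quota types default to a one-hour wait.
    return reset_times.get(quota_type, 3600)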
```
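
`wait_for_reset` depends on `seconds_until_midnight`, which is not shown. A minimal sketch follows. It assumes the daily quota resets at 00:00 UTC, which may not hold for every provider, and the optional `now` parameter is added only so the function can be tested.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def seconds_until_midnight(now: Optional[datetime] = None) -> int:
    """Seconds from `now` until the next 00:00 UTC (assumed reset time)."""
    if now is None:
        now = datetime.now(timezone.utc)
    # Move to the following day, then truncate to the start of that day.
    next_midnight = (now + timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return int((next_midnight - now).total_seconds())
```

If the provider resets in a different time zone, convert `now` to that zone before truncating.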

### Secondary Services
When the primary service is at capacity:
1. Check alternative service quota
2. Use cached results if available
3. Return partial results with warning
4. Queue for later execution
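
The four steps above can be sketched as an ordered fallback chain. Everything named here (`run_with_fallbacks`, its parameters, and the result dictionary keys) is hypothetical; the sketch only shows the order in which the options are tried.

```python
from typing import Callable, Optional


def run_with_fallbacks(
    task_id: str,
    secondary_has_quota: Callable[[], bool],
    run_on_secondary: Callable[[str], str],
    cache: dict[str, str],
    partial_result: Optional[str],
    queue: list[str],
) -> dict:
    """Try each fallback in the listed order and report which one was used."""
    # 1. Check alternative service quota
    if secondary_has_quota():
        return {"source": "secondary", "result": run_on_secondary(task_id)}
    # 2. Use cached results if available
    if task_id in cache:
        return {"source": "cache", "result": cache[task_id]}
    # 3. Return partial results with warning
    if partial_result is not None:
        return {
            "source": "partial",
            "result": partial_result,
            "warning": "partial result: primary service at capacity",
        }
    # 4. Queue for later execution
    queue.append(task_id)
    return {"source": "queued", "result": None}
```

Returning the `source` alongside the result lets callers log which fallback served each request.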

## Alerting Patterns

### Threshold Notifications

