
Quota Management
Estimate tokens and API cost from files and task types before kicking off large agent runs so you stay inside quota and budget.
Install
npx skills add https://github.com/athola/claude-night-market --skill quota-management
What is this skill?
- File-based token estimation with suffix-specific character-to-token ratios (code, JSON, text)
- Task-based ranges for analysis, summarization, pattern extraction, and boilerplate generation
- USD cost helper using per-model input and output rates (e.g. gemini-pro, gemini-flash, qwen-max)
- estimate_file_tokens(Path) pattern sized for pre-flight checks in scripts
- Documented estimated_tokens: 450 in skill metadata for lightweight invocation
Adoption & trust: 1 install on skills.sh; 304 GitHub stars; 3/3 security scanners passed (skills.sh audits); trending (+100% hot-view momentum).
Journey fit
Quota and estimation discipline matters most while wiring agent workflows and batch automation, where unbounded context burns budget silently. File-based token ratios and per-task token tables are agent-tooling primitives, not shipping or marketing deliverables.
Common Questions / FAQ
Is Quota Management safe to install?
skills.sh reports 3 of 3 security scanners passed. Review the Security Audits panel on this page before installing in production.
SKILL.md - Quota Management
# Estimation Patterns

## Token Estimation

### File-Based Estimation

```python
from pathlib import Path

# Characters-per-token ratios by file type (file size is divided by the ratio)
TOKEN_RATIOS = {
    "code": 3.2,     # .py, .js, .ts, .go, .rs
    "json": 3.6,     # .json, .yaml, .yml, .toml
    "text": 4.2,     # .md, .txt, .rst
    "default": 4.0,  # any other suffix
}

def estimate_file_tokens(path: Path) -> int:
    """Estimate tokens for a file from its size in bytes."""
    size = path.stat().st_size
    suffix = path.suffix.lower()

    if suffix in [".py", ".js", ".ts", ".go", ".rs"]:
        ratio = TOKEN_RATIOS["code"]
    elif suffix in [".json", ".yaml", ".yml", ".toml"]:
        ratio = TOKEN_RATIOS["json"]
    elif suffix in [".md", ".txt", ".rst"]:
        ratio = TOKEN_RATIOS["text"]
    else:
        ratio = TOKEN_RATIOS["default"]

    return int(size / ratio)
```

### Task-Based Estimation

| Task Type | Input Tokens | Output Tokens |
|-----------|--------------|---------------|
| File analysis | 15-50/file | 200-500 |
| Code summarization | 1-3% of source | 300-800 |
| Pattern extraction | 5-20/match | 100-300 |
| Boilerplate generation | 50-200/template | Varies |

## Cost Estimation

### Cost Calculation

```python
def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    model: str
) -> float:
    """Estimate cost in USD."""
    # Rates are USD per 1M tokens
    rates = {
        "gemini-pro": {"input": 0.50, "output": 1.50},
        "gemini-flash": {"input": 0.075, "output": 0.30},
        "qwen-max": {"input": 0.40, "output": 1.20},
    }

    # Unknown models fall back to the gemini-pro rates
    rate = rates.get(model, rates["gemini-pro"])
    input_cost = (input_tokens / 1_000_000) * rate["input"]
    output_cost = (output_tokens / 1_000_000) * rate["output"]

    return input_cost + output_cost
```

### Cost Thresholds

| Category | Cost Range | Example Operations |
|----------|------------|-------------------|
| Low | <$0.01 | Pattern counting, imports extraction |
| Medium | $0.01-$0.10 | Module summarization, code analysis |
| High | >$0.10 | Full codebase review, documentation |

## Pre-Flight Checks

### Estimation Workflow

```python
def preflight_check(
    files: list[Path], prompt: str, task_type: str, model: str
) -> dict:
    """Estimate resources before operation."""
    # estimate_output_tokens and can_handle_task are helpers not shown here
    input_tokens = sum(estimate_file_tokens(f) for f in files)
    input_tokens += len(prompt) // 4  # Prompt tokens (~4 characters each)

    output_tokens = estimate_output_tokens(task_type)
    cost = estimate_cost(input_tokens, output_tokens, model)

    return {
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "estimated_cost": cost,
        "within_quota": can_handle_task(input_tokens)
    }
```

---
name: threshold-strategies
description: Strategies for handling quota thresholds and graceful degradation
estimated_tokens: 400
---

# Threshold Strategies

## Degradation Patterns

### Progressive Degradation

```python
def get_degradation_strategy(usage_percent: float) -> str:
    if usage_percent < 80:
        return "full_operation"
    elif usage_percent < 90:
        return "reduce_batch_size"
    elif usage_percent < 95:
        return "essential_only"
    else:
        return "defer_or_secondary"
```

### Batch Size Adjustment

| Threshold | Batch Size | Rationale |
|-----------|------------|-----------|
| <80% | 100% | Full capacity available |
| 80-90% | 50% | Conserve for critical ops |
| 90-95% | 25% | Minimal batches only |
| >95% | 0% | Defer all batches |

## Recovery Strategies

### Wait for Reset

```python
def wait_for_reset(quota_type: str) -> int:
    """Returns seconds until quota resets."""
    # seconds_until_midnight is a helper not shown here
    reset_times = {
        "rpm": 60,    # Requests per minute
        "tpm": 60,    # Tokens per minute
        "daily": seconds_until_midnight()
    }
    # Unknown quota types default to one hour
    return reset_times.get(quota_type, 3600)
```

### Secondary Services

When the primary service is at capacity:

1. Check alternative service quota
2. Use cached results if available
3. Return partial results with warning
4. Queue for later execution

## Alerting Patterns

### Threshold Notifications
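
The source stops at this heading, so the following is only a hedged sketch of what a threshold notification might look like, not the skill's own implementation. It reuses the 80%, 90%, and 95% boundaries from the degradation tiers above; the level names (`notice`, `warning`, `critical`), the `alert_level` function, and the `ThresholdNotifier` class are hypothetical and introduced purely for illustration.

```python
from typing import Optional

# Thresholds mirror the degradation tiers above; the level names are assumptions.
ALERT_LEVELS = [(95.0, "critical"), (90.0, "warning"), (80.0, "notice")]


def alert_level(usage_percent: float) -> Optional[str]:
    """Return the highest alert level reached, or None below 80%."""
    for threshold, level in ALERT_LEVELS:
        if usage_percent >= threshold:
            return level
    return None


class ThresholdNotifier:
    """Hypothetical helper: report a message only when the alert level changes."""

    def __init__(self) -> None:
        self._last_level: Optional[str] = None

    def check(self, usage_percent: float) -> Optional[str]:
        level = alert_level(usage_percent)
        changed = level != self._last_level
        self._last_level = level
        # Stay silent while the level is unchanged or usage is below 80%.
        if level is None or not changed:
            return None
        return f"{level}: quota usage at {usage_percent:.1f}%"


notifier = ThresholdNotifier()
for usage in (50, 82, 85, 96):
    message = notifier.check(usage)
    if message:
        print(message)
```

Tracking the last level keeps repeated polls within the same tier from producing duplicate alerts; a drop below 80% clears the state so the next crossing is reported again.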