
Delegation Core
Estimate token spend and apply a cost-benefit gate before routing work to external LLMs like Gemini or Qwen.
Overview
delegation-core is an agent skill most often used in Operate (also Build) that supplies token cost tables, budget tiers, and a benefit-versus-cost decision framework for external LLM delegation.
Install
npx skills add https://github.com/athola/claude-night-market --skill delegation-core
What is this skill?
- Per-1M-token rate tables for Gemini 2.0 Pro/Flash and Qwen with context-window notes
- Cost-benefit formula: delegate only when benefit exceeds cost × 3 safety margin
- Tiered examples from sub-cent counts to $0.10+ large-context reviews
- Parent skill conjure:delegation-core with dependencies leyline:quota-management and leyline:usage-logging
- Estimated ~250 tokens in the skill package for quick reference during delegation decisions
- Benefit must exceed Cost × 3 before delegating per the documented safety margin
- Gemini 2.0 Flash listed at $0.075 input and $0.30 output per 1M tokens
- Skill frontmatter estimates ~250 tokens for the package body
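As a rough illustration of the per-1M-token arithmetic behind the list above, the sketch below prices a single call at the Gemini 2.0 Flash rates quoted there. The function name `estimate_cost` and the token counts are hypothetical and are not part of the skill.

```python
# Rates are USD per 1M tokens (Gemini 2.0 Flash figures quoted in the list above).
FLASH_INPUT_RATE = 0.075
FLASH_OUTPUT_RATE = 0.30


def estimate_cost(input_tokens: int, output_tokens: int,
                  input_rate: float = FLASH_INPUT_RATE,
                  output_rate: float = FLASH_OUTPUT_RATE) -> float:
    """Return the estimated USD cost of one call (hypothetical helper)."""
    # Divide by 1,000,000 because the rates are quoted per 1M tokens.
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# Assumed workload: 125K input tokens and 2K output tokens.
cost = estimate_cost(input_tokens=125_000, output_tokens=2_000)
print(f"Estimated cost: ${cost:.6f}")
```

Passing different `input_rate` and `output_rate` values lets the same helper price other models.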
Adoption & trust: 1 install on skills.sh; 304 GitHub stars; 1 of 3 security scanners passed (skills.sh audits); trending (+100% hot-view momentum).
What problem does it solve?
You want to offload work to another model but have no consistent way to compare token rates or decide if delegation is worth the bill.
Who is it for?
Agent builders who routinely delegate scans, summaries, or codegen to Gemini Flash/Pro or Qwen and need a repeatable cost gate.
Skip if: your team uses only a single in-agent model with no external API billing, or already enforces hard monthly caps without per-task estimation.
When should I use this skill?
Before routing repetitive file scans, large summarization, or bulk generation to Gemini, Qwen, or other billed external models.
What do I get? / Deliverables
You can price a delegation in dollars, apply the 3× safety-margin rule, and align spend with leyline quota and usage logging before the call runs.
- Dollar estimate for a planned delegation from input/output token counts
- Go/no-go decision using the cost-benefit ratio with 3× margin
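The go/no-go decision can be sketched as a small gate, assuming the benefit is computed the way the skill's SKILL.md describes it (time saved × hourly rate + quality improvement value). The function name `should_delegate`, its parameters, and the sample numbers are hypothetical.

```python
def should_delegate(cost_usd: float, time_saved_hours: float, hourly_rate: float,
                    quality_value: float = 0.0, margin: float = 3.0) -> bool:
    """Return True only when the benefit exceeds cost times the safety margin."""
    benefit = time_saved_hours * hourly_rate + quality_value
    return benefit > cost_usd * margin


# Assumed case: a $0.06 summary that saves 15 minutes of work billed at $40/hour.
print(should_delegate(cost_usd=0.06, time_saved_hours=0.25, hourly_rate=40.0))

# Assumed case: a $0.50 review that saves well under a minute of work.
print(should_delegate(cost_usd=0.50, time_saved_hours=0.01, hourly_rate=40.0))
```

The comparison is strict, so a benefit exactly equal to 3× the cost does not pass the gate.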
Journey fit
Spans multiple journey phases - primary shelf plus alternate fits below.
Canonical shelf is Operate because quota, logging, and per-token rates matter most once delegation runs in production, even though you may read this while designing agent workflows in Build. Infra fits budget guidelines, service rate tables, and integration with quota-management and usage-logging as operational controls for agent spend.
Where it fits
Compare Flash versus Pro pricing before wiring a sub-agent to summarize fifty files on every PR.
Align monthly spend with leyline quota-management after usage-logging shows a spike in delegated reviews.
Drop high-cost delegation patterns that fail the benefit-greater-than-3×-cost check during iteration planning.
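To make the Flash-versus-Pro comparison concrete, here is a minimal sketch that prices a fifty-file summary per PR and projects it over a month. The rates are the Gemini 2.0 figures from the skill's rate table. The tokens per file, output size, and PR volume are assumptions chosen only for illustration, and the helper names are hypothetical.

```python
# USD per 1M tokens as (input_rate, output_rate), from the skill's rate table.
RATES = {"flash": (0.075, 0.30), "pro": (0.50, 1.50)}


def cost_per_pr(model: str, files: int = 50, tokens_per_file: int = 800,
                output_tokens: int = 1_500) -> float:
    """Estimated USD cost of summarizing `files` files once (assumed sizes)."""
    input_rate, output_rate = RATES[model]
    input_tokens = files * tokens_per_file
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


def monthly_cost(model: str, prs_per_month: int = 200) -> float:
    """Project the per-PR cost over an assumed monthly PR volume."""
    return cost_per_pr(model) * prs_per_month


for model in RATES:
    print(f"{model}: ${cost_per_pr(model):.5f} per PR, "
          f"${monthly_cost(model):.2f} per month")
```

Under these assumptions the gap between the two models is small per PR but grows with PR volume, which is the point at which a monthly quota check becomes useful.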
How it compares
Use for token-dollar math on delegated LLM calls, not for writing API reference docs or implementing barcode capture SDKs.
Common Questions / FAQ
Who is delegation-core for?
Solo and indie builders shipping agent workflows who pay per token on external models and want delegation-core’s rates and cost-benefit rule before firing another API call.
When should I use delegation-core?
Use it in Operate when reviewing agent bills and quotas, in Build when designing which tasks to route to Flash versus Pro, and before any medium- or high-cost delegation (summaries, architecture passes, bulk codegen).
Is delegation-core safe to install?
It is reference math and guidelines. Review the Security Audits panel on this Prism page, and keep any real API keys only in your own quota/logging setup, not inside the skill text.
Workflow Chain
Requires first: leyline quota management, leyline usage logging
SKILL.md - Delegation Core
# Cost Estimation and Budget Guidelines

## Service Cost Comparisons

**Gemini 2.0 Models (per 1M tokens):**
- Input: $0.50, Output: $1.50 (Pro version)
- Input: $0.075, Output: $0.30 (Flash version)
- Context: Up to 1M tokens

**Qwen Models (per 1M tokens):**
- Input: $0.20-0.50, Output: $0.60-1.20 (varies by provider)
- Context: Up to 100K+ tokens
- Sandbox execution: Typically $0.001-0.01 per request

## Cost Decision Framework

**Calculate Cost-Benefit Ratio:**
```
Cost = (input_tokens * input_rate) + (output_tokens * output_rate)
Benefit = time_saved * hourly_rate + quality_improvement_value

Delegate if: Benefit > Cost * 3 (safety margin for quality risks)
```

Rates in the formula are per token: divide the per-1M figures above by 1,000,000.

## Practical Cost Examples

Where token counts are given, the dollar figures match the Gemini 2.0 Pro rates above ($0.50 input, $1.50 output per 1M tokens).

**Low-Cost Delegations (<$0.01):**
- Count function occurrences: 50 files × 30 tokens = 1,500 tokens ≈ $0.00075
- Extract import statements: 100 files × 50 tokens = 5,000 tokens ≈ $0.0025
- Generate 10 boilerplate files: ~2K output tokens = $0.003

**Medium-Cost Delegations ($0.01-0.10):**
- Summarize 50K lines of code: ~125K tokens = $0.06-0.19
- Analyze architecture of 100 files: ~80K tokens = $0.04-0.12
- Generate 20 API endpoints: ~3K output tokens = $0.005

**High-Cost Delegations ($0.10+):**
- Review entire codebase (500K+ tokens): $0.25-0.75
- Generate detailed documentation: $0.15-0.45
- Complex refactoring analysis: $0.20-0.60

## Cost Optimization Strategies

**Input Optimization:**
- Remove comments, tests, examples when not needed
- Use selective file patterns instead of entire directories
- Pre-filter with grep/awk for relevant content
- Combine multiple small queries into one request

**Model Selection:**
- Use Flash/cheaper models for simple extraction tasks
- Reserve Pro models for complex analysis only
- Consider batch processing for repetitive tasks

### Cheapest-Capable Model Selection

When dispatching subagents, select the cheapest model that can handle the task. This is a recommendation, not a mandate; override when judgment dictates.

| Task Type | Has Detailed Plan? | Recommended Model |
|-----------|-------------------|-------------------|
| Implementation | Yes | haiku |
| Implementation | No | sonnet |
| Planning/reasoning | Any | sonnet/opus |
| Security/safety review | Any | sonnet minimum, prefer opus |
| Code review | Any | sonnet minimum |

**Security/safety task types** (never downgrade):
- Security audit
- Secret scanning
- Permissions analysis
- Auth-critical review
- Dependency vulnerability scanning

If a code review surfaces security-relevant findings, the reviewer should note "security-relevant" in its output to prevent downstream model downgrade.

**Fallback**: When a downgrade rule triggers but the task type is ambiguous, default to sonnet.

**Rationale**: Implementation tasks with detailed plans are well-scoped and predictable; haiku handles these effectively. Planning and security tasks require reasoning depth that cheaper models may lack.

**Alternative Strategies:**
- Break large tasks into smaller, targeted analyses
- Use local processing for sensitive operations
- Cache results for repeated analysis requests

## Cost Monitoring

**Set Daily/Monthly Budgets:**
- Development: $1-5/day
- Batch processing: $10-50/month
- Enterprise: $100-500/month

**Tracking Methods:**
- Use built-in usage logging tools
- Monitor API dashboard for consumption
- Set up alerts for unexpected spikes

**Using Leyline for Cost Tracking:**
```python
from leyline.quota_tracker import QuotaTracker
from leyline.usage_logger import UsageLogger

# Initialize for your service
tracker = QuotaTracker(service="gemini")
logger = UsageLogger(service="gemini")

# Check quota before operation
level, warnings = tracker.get_quota_status()
if level == "c