Summon

Name: Summon
Author: athola

athola/claude-night-market

Keep long-running egregore agent sessions inside token windows and recover cleanly from Claude API rate limits.

Overview

Summon is an agent skill for the Operate phase that manages egregore token budget windows, rate-limit detection, and cooldown-driven graceful shutdown.

Install

npx skills add https://github.com/athola/claude-night-market --skill summon

What is this skill?

Tracks cumulative token usage and session count inside a configurable budget window (default 5 hours).
Detects rate limits via HTTP 429, API error text, or explicit retry-after headers and stops work immediately.
Computes cooldown as retry-after minutes plus configurable padding (default 10 minutes) to avoid relaunch loops.
Falls back to a 30-minute default cooldown plus padding when retry-after is missing.
Persists window start, estimated tokens, last rate limit, and cooldown_until in structured JSON state.
Default budget window type: 5h
Default cooldown padding: 10 minutes
Default cooldown when retry-after is absent: 30 minutes

Compatible agents: Claude Code, any compatible agent

Adoption & trust: 1 install on skills.sh; 304 GitHub stars; 0/3 security scanners passed (skills.sh audits); trending (+100% hot-view momentum).

What problem does it solve?

You run chained agent sessions but have no shared token window or safe cooldown when Claude rate-limits you mid-orchestration.

Who is it for?

Solo builders operating custom egregore or Night Market orchestrators across multiple agent sessions in one day.

Skip if: you only run short single-turn Claude Code tasks and never hit cumulative token or rate-limit windows.

When should I use this skill?

Use it when managing token budget windows, detecting API rate limits, or configuring a cooldown before resuming egregore sessions.

What do I get? / Deliverables

After applying the protocol, the orchestrator persists budget state, enters a padded cooldown on limits, and avoids relaunching until the window is safe again.

Updated budget.json with window, tokens, and cooldown fields
Orchestrator stop and cooldown schedule on rate limit

Journey fit

Primary fit

Operate · Infrastructure & cost

The primary shelf is Operate because budget windows, cooldowns, and graceful shutdown govern production-style agent orchestration once sessions are already running. The Infrastructure & cost category fits because the skill deals with persistent budget state in `.egregore/budget.json`, window timing, and orchestrator shutdown logic rather than feature coding.

How it compares

Use it instead of ad-hoc sleep-and-retry loops that ignore shared session budget files.

Common Questions / FAQ

Who is summon for?

Summon is for indie developers and agent operators who run Athola egregore-style orchestration and need disciplined token and rate-limit handling across sessions.

When should I use summon?

Use summon during Operate when you maintain `.egregore/budget.json`, detect 429 or retry-after from the API, or need default 5-hour windows and padded cooldowns before resuming skill chains.

Is summon safe to install?

Review the Security Audits panel on this Prism page and inspect the skill source in your repo before letting an orchestrator write budget state or stop production runs.

SKILL.md

# Budget Module

Manages token budget windows, detects rate limits, calculates
cooldown periods, and triggers graceful shutdown when limits
are reached.

## Budget Window

An egregore session operates within a budget window (default:
5 hours).
The window tracks cumulative token usage and rate limit
events across multiple sessions.

The budget state is in `.egregore/budget.json`:

```json
{
  "window_type": "5h",
  "window_started_at": "2026-03-04T10:00:00+00:00",
  "estimated_tokens_used": 0,
  "session_count": 1,
  "last_rate_limit_at": null,
  "cooldown_until": null
}
```
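As an illustration of how an orchestrator might read and write this file, here is a minimal Python sketch. The field names come from the JSON above; the `BudgetState` class and the `load_budget` / `save_budget` helpers are hypothetical names introduced for this example, not part of the skill's published code.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path
from typing import Optional


@dataclass
class BudgetState:
    """Hypothetical mirror of the fields shown in .egregore/budget.json."""

    window_type: str = "5h"
    window_started_at: str = ""
    estimated_tokens_used: int = 0
    session_count: int = 0
    last_rate_limit_at: Optional[str] = None
    cooldown_until: Optional[str] = None


def load_budget(path: Path) -> BudgetState:
    """Read the state file, or return defaults if it does not exist yet."""
    if not path.exists():
        return BudgetState()
    return BudgetState(**json.loads(path.read_text(encoding="utf-8")))


def save_budget(state: BudgetState, path: Path) -> None:
    """Write to a temp file, then rename, so readers never see a partial file."""
    path.parent.mkdir(parents=True, exist_ok=True)
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(json.dumps(asdict(state), indent=2), encoding="utf-8")
    tmp.replace(path)
```

The write-then-rename step is a design choice for this sketch: because a watchdog may read the same file, replacing it in one step avoids exposing a half-written state.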

## Rate Limit Detection

The orchestrator detects rate limits through two signals:

1. **API error response**: a skill invocation fails with an
   HTTP 429 or a rate-limit error message from the Claude
   API.
2. **Explicit retry-after header**: the error includes a
   retry-after duration in seconds.

When either signal is detected, the orchestrator must stop
work immediately and enter cooldown.
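The two signals can be combined into a single check. The sketch below is one possible shape, assuming the caller already has the HTTP status, the error text, and the response headers in hand; `RateLimitSignal`, `detect_rate_limit`, and the substring heuristics are assumptions made for illustration, not the skill's actual detection code.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitSignal:
    detected: bool
    retry_after_seconds: Optional[float] = None


def detect_rate_limit(
    status_code: Optional[int],
    message: str,
    headers: Mapping[str, str],
) -> RateLimitSignal:
    """Return whether a failed call looks rate-limited, plus any retry-after."""
    retry_after: Optional[float] = None
    for name, value in headers.items():
        if name.lower() == "retry-after":
            try:
                retry_after = float(value)
            except ValueError:
                retry_after = None  # non-numeric value: treated as absent here

    text = message.lower()
    detected = (
        status_code == 429
        or "rate limit" in text   # heuristic match on error text (assumption)
        or "rate_limit" in text
        or retry_after is not None
    )
    return RateLimitSignal(detected, retry_after)
```

A caller would stop work as soon as `detected` is true and pass `retry_after_seconds` (possibly `None`) on to the cooldown calculation.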

## Cooldown Calculation

The cooldown duration is computed as follows:

```
cooldown_minutes = retry_after_seconds / 60
                 + config.budget.cooldown_padding_minutes
```

The padding (default: 10 minutes) prevents the watchdog from
relaunching too early and hitting the same rate limit again.

If no retry-after header is present, use a default cooldown
of 30 minutes plus padding.
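The formula and its fallback fit in a few lines of Python. This is a sketch under the defaults stated above; the function and constant names are illustrative.

```python
from typing import Optional

DEFAULT_COOLDOWN_MINUTES = 30   # used when no retry-after is present
DEFAULT_PADDING_MINUTES = 10    # default for config.budget.cooldown_padding_minutes


def cooldown_minutes(
    retry_after_seconds: Optional[float],
    padding_minutes: float = DEFAULT_PADDING_MINUTES,
) -> float:
    """Cooldown = retry-after (converted to minutes) or the default, plus padding."""
    if retry_after_seconds is None:
        base = DEFAULT_COOLDOWN_MINUTES
    else:
        base = retry_after_seconds / 60
    return base + padding_minutes
```

For example, a retry-after of 120 seconds with the default padding gives 2 + 10 = 12 minutes, and a missing header gives 30 + 10 = 40 minutes.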

## Rate Limit Recovery

When a rate limit is detected:

1. **Save manifest**: write the current pipeline state to
   `.egregore/manifest.json` so no progress is lost.
2. **Record rate limit**: call
   `budget.record_rate_limit(cooldown_minutes)` to update
   the budget state.
3. **Save budget**: write `.egregore/budget.json` with the
   updated cooldown timestamp.
4. **Alert overseer**: send a notification via the configured
   channel (see `notify.py`) with the rate limit details and
   expected resume time.
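Step 2 refers to `budget.record_rate_limit(cooldown_minutes)`, whose body is not shown here. A plausible sketch, written as a standalone function over a plain dict and assuming it only needs to stamp `last_rate_limit_at` and `cooldown_until`, might look like this:

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def record_rate_limit(
    budget: dict,
    cooldown_minutes: float,
    now: Optional[datetime] = None,
) -> dict:
    """Return a copy of the budget state with the rate limit and cooldown recorded.

    Illustrative only: the real method lives on the budget object and may
    update more fields than the two set here.
    """
    now = now or datetime.now(timezone.utc)
    updated = dict(budget)
    updated["last_rate_limit_at"] = now.isoformat()
    updated["cooldown_until"] = (now + timedelta(minutes=cooldown_minutes)).isoformat()
    return updated
```

Timestamps are written in ISO 8601 with a UTC offset, matching the format of `window_started_at` in the state file.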

### In-Session Recovery (2.1.71+, all providers 2.1.73+)

Use `CronCreate` to schedule a one-shot resume at the
cooldown expiry time. As of 2.1.73, `/loop` and
`CronCreate` are available on Bedrock, Vertex, Foundry,
and with telemetry disabled (previously first-party API
only).

```
CronCreate(
  cron: "<min> <hour> * * *",
  prompt: "Cooldown expired. Read .egregore/manifest.json
    and resume the pipeline. Invoke
    Skill(egregore:summon) to continue.",
  recurring: false
)
```

**Advantages over watchdog restart:**

- Session stays alive: no context loss, no manifest
  re-read overhead, no fresh session startup cost
- Exact timing: fires at cooldown_until instead of
  polling every 5 minutes
- No OS-level setup: works without launchd/systemd

The session remains idle between the rate limit and the
scheduled prompt. When the cron fires, the orchestration
loop resumes with the full conversation context intact.
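The `<min> <hour>` fields have to be derived from `cooldown_until`. The helper below is a hedged sketch of that conversion: `cron_fields` is a hypothetical name, and it assumes the cron expression is evaluated in the machine's local time zone unless a zone is passed explicitly. It rounds up to the next whole minute so the prompt never fires before the cooldown has expired.

```python
from datetime import datetime, timedelta, timezone, tzinfo
from typing import Optional


def cron_fields(cooldown_until: str, tz: Optional[tzinfo] = None) -> str:
    """Convert an ISO 8601 timestamp into a '<min> <hour> * * *' expression.

    tz=None converts to the system's local time zone (an assumption about
    how the scheduler interprets the expression).
    """
    when = datetime.fromisoformat(cooldown_until).astimezone(tz)
    if when.second or when.microsecond:
        # Round up so the resume never lands inside the cooldown.
        when = when.replace(second=0, microsecond=0) + timedelta(minutes=1)
    return f"{when.minute} {when.hour} * * *"


print(cron_fields("2026-03-04T10:12:00+00:00", tz=timezone.utc))  # 12 10 * * *
```

Because the expression carries only minute and hour, this sketch is meant for cooldowns that end within the next day.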

### Fallback: Exit and Watchdog

If `CronCreate` is unavailable (pre-2.1.71, or
pre-2.1.73 on Bedrock/Vertex/Foundry), the cooldown
exceeds 7 days (cron task auto-expiry), or the session
itself needs to exit for other reasons:

5. **Exit cleanly**: exit with code 0. A non-zero exit
   would trigger the watchdog's crash handler instead of
   the cooldown-aware restart path.

The watchdog checks `budget.json` before relaunching and
waits until the cooldown expires.
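On the watchdog side, the wait can be computed directly from the state file. The following is a rough sketch, not the watchdog's actual code: `seconds_until_relaunch` is a hypothetical helper, and it assumes `cooldown_until` is stored as an ISO 8601 timestamp with a UTC offset, as in the example state.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def seconds_until_relaunch(budget_path: Path, now: Optional[datetime] = None) -> float:
    """Return how long the watchdog should wait before launching a session."""
    now = now or datetime.now(timezone.utc)
    try:
        budget = json.loads(budget_path.read_text(encoding="utf-8"))
    except FileNotFoundError:
        return 0.0  # no state yet: nothing to wait for

    cooldown_until = budget.get("cooldown_until")
    if not cooldown_until:
        return 0.0
    remaining = (datetime.fromisoformat(cooldown_until) - now).total_seconds()
    return max(0.0, remaining)
```

A watchdog loop could sleep for the returned number of seconds and then re-read the file before launching, since the state may have changed in the meantime.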

## Pre-Launch Cooldown Check

Before starting any work, the orchestrator must check:

```python
import sys

if is_in_cooldown(budget):  # budget: state loaded from .egregore/budget.json
    # Do not start. Exit with code 0 and let the watchdog retry later.
    sys.exit(0)
```

The watchdog also performs this check before launching a new
session.
This double-check prevents races where the watchdog reads a
stale budget file.
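The body of `is_in_cooldown` is not shown in this document. One way it could be written, assuming the budget is a dict with an ISO 8601 `cooldown_until` that carries a UTC offset, is sketched below.

```python
from datetime import datetime, timezone
from typing import Optional


def is_in_cooldown(budget: dict, now: Optional[datetime] = None) -> bool:
    """True while the current time is still before cooldown_until.

    Illustrative implementation; the `now` parameter exists only to make
    the check testable and is not implied by the original call site.
    """
    cooldown_until = budget.get("cooldown_until")
    if cooldown_until is None:
        return False
    now = now or datetime.now(timezone.utc)
    return now < datetime.fromisoformat(cooldown_until)
```

Comparing timezone-aware datetimes keeps the check correct even if the orchestrator and the watchdog run with different local time zones.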

## Window Reset

The budget window resets when `window_started_at` is older
than the configured `window_type` duration.
On reset:

1. Set `estimated_tokens_used` to 0.
2. Set `session_count` to 0.
3. Set `window_started_at` to the current time.
4. Clear `last_rate
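The expiry test and reset steps 1 to 3 can be sketched as follows. Only the `"5h"` form of `window_type` appears in this document, so the parser below assumes an `<number>h` format and rejects anything else; all function names are illustrative, and the sketch leaves the rate-limit fields untouched.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def window_duration(window_type: str) -> timedelta:
    """Parse a window type such as '5h' (hours-only format is an assumption)."""
    if not window_type.endswith("h"):
        raise ValueError(f"unsupported window_type: {window_type!r}")
    return timedelta(hours=float(window_type[:-1]))


def window_expired(budget: dict, now: Optional[datetime] = None) -> bool:
    """True when window_started_at is older than the window duration."""
    now = now or datetime.now(timezone.utc)
    started = datetime.fromisoformat(budget["window_started_at"])
    return now - started > window_duration(budget["window_type"])


def reset_window(budget: dict, now: Optional[datetime] = None) -> dict:
    """Return a copy of the state with reset steps 1 to 3 applied."""
    now = now or datetime.now(timezone.utc)
    updated = dict(budget)
    updated["estimated_tokens_used"] = 0           # step 1
    updated["session_count"] = 0                   # step 2
    updated["window_started_at"] = now.isoformat() # step 3
    return updated
```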
