Compression Strategy

Log pastes most often surface during incident response and production debugging, which maps to Operate. Error and trace investigation is where oversized logs routinely trigger context bloat.

Also useful

Also useful

Where it fits

Example use

Tail and grep production errors before pasting a trace into Claude for root-cause analysis.

Example use

Filter a failed GitHub Actions log to the last 100 lines plus error keywords before asking the agent for a fix.

Example use

Minimize hook or MCP server logs when iterating on local agent tooling without reloading megabytes of output.

How it compares

Use a structured hygiene workflow instead of reaching for gzip or full-file paste as the default.

Common Questions / FAQ

Who is compression-strategy for?

Solo and indie builders using Claude Code or similar agents who regularly paste CI, hook, or server logs into chat and need to protect context window budget.

When should I use compression-strategy?

Before pasting more than ~100 log lines, after a paste pushed context above ~50%, or when a teammate suggested compression before filtering—also during Ship CI triage and Build-time local debugging.

Is compression-strategy safe to install?

It is procedural documentation and shell examples for log handling; review the Security Audits panel on this page before installing any skill from the repo.

SKILL.md

READMESKILL.md - Compression Strategy

# Log Debugging Hygiene

When pasting logs into a Claude Code session, filter at the
source before you reach for compression. On the committed
`intake_queue.jsonl` fixture in this repo (a deterministic
synthetic intake log under `tests/fixtures/`), `tail -n 100` saves
95.0% of bytes and stays forensically useful because the output is
a literal subset of the log. `gzip -9` on the same fixture saves
65.0%, so filtering wins by 30 percentage points and needs no
extra tooling.

This module formalizes a three-tier workflow so that compression
is the last lever pulled, not the first.

## When to Use

Reach for this module when any of the following apply:

- About to paste more than 100 log lines into a session.
- A previous paste pushed context above 50% utilization.
- A debugging trace, CI failure, or hook log is the source of
  the bloat (not chat history or code).
- A teammate or skill recommended "compression" before any
  filtering was tried.

Skip this module if the paste is already under 2000 tokens or
if the entire log is needed verbatim for a regression report.

## Required TodoWrite Items

1. `log-debug:tier-1-filtered`
2. `log-debug:tier-2-minimized`
3. `log-debug:token-measured`

## Tier 1: Filter at Source (90-99% byte reduction)

Filtering keeps the lines you actually need and discards
the rest. This step delivers the largest savings and
requires no new tooling.

| Scenario | Command | Why |
|----------|---------|-----|
| Last N lines | `tail -n 100 file.log` | Most recent state |
| First N lines | `head -n 50 file.log` | Startup or init |
| Pattern matches | `rg -n "ERROR\|FAIL" file.log` | Targeted lines |
| Context around match | `rg -B 5 -A 10 "panic" file.log` | Surrounding rows |
| JSON field filter | `jq -c 'select(.level=="error")' f.jsonl` | Structured data |
| Time window | `awk '/14:23:00/,/14:24:00/' file.log` | Slice by stamp |
| Last N unique | `sort -u file.log \| tail -n 30` | Dedup then trim |

Benchmark on the committed fixture
`plugins/conserve/tests/fixtures/intake_queue.jsonl`
(0.89 MB, 2000 lines):

| Command | Output bytes | Reduction |
|---------|--------------|-----------|
| `tail -n 100` | 46 KB | 95.0% |
| `jq -c 'select(.tool_name)' \| tail -n 20` | 9 KB | 99.0% |
| `gzip -9` (compress all) | 319 KB | 65.0% |

Every row is reproducible from the committed fixture with stock
tools; the `tail` row is guarded by
`tests/test_log_debugging_hygiene.py::test_filter_first_claim_is_reproducible`.
The fixture is a deterministic synthetic intake log (high-entropy
per-line content, so `gzip` cannot collapse it the way it would a
repetitive log); it replaces an earlier benchmark that pointed at a
mutable runtime artifact. A dedicated log-template compressor
(logs-tokenizer, external and not bundled, see Tier 3) saved 70.3%
on the original real-traffic snapshot. Filtering still wins on byte
reduction, and by roughly 99x in absolute terms when `jq` can name
the relevant rows.

## Tier 2: Minimize Structurally (50-95% reduction)

When you cannot filter to specific lines, switch your tool to
its most compact output mode. Many tools have a flag for this.

| Tool | Verbose form | Compact form |
|------|--------------|--------------|
| `git log` | `git log` (full) | `git log --oneline -20` |
| `git diff` | `git diff` | `git diff --stat` |
| `pytest` | `pytest -v` | `pytest --tb=short -x` |
| `cargo build` | full output | `2>&1 \| rg "^(error\|warning)" \| head -50` |
| `jq` | pretty | `jq -c` (single-line) |
| `npm install` | full | `--silent` |
| Hook trace | full JSONL | `jq -c '{ts,event}'` projection |

Pair tier 2 with `head -n N` or `tail -n N` to bound the
output even after compaction.

## Tier 3: Compress as Fallback (60-80% reduction)

Use compression only when:

- You genuinely need every line (anomaly detection across a
  full trace, performance debugging where every timing
  matters, race condition analysis).
- Tier 1 and tier 2 cannot isolate the relevant subset.
- The compressed payload still fits

What is this skill?

Three-tier workflow: filter at source, then minimize, then compress only as a last lever

Required TodoWrite gates: tier-1-filtered, tier-2-minimized, token-measured

Fixture-backed guidance: tail -n 100 beats gzip -9 by ~30 percentage points on byte savings

Scenario table for tail, grep, jq, and time-window filters before any compression

Explicit skip rules when the full log must stay verbatim (e.g. regression reports)

Three-tier workflow with three required TodoWrite items

On repo fixture intake_queue.jsonl, tail -n 100 saves 95.0% bytes vs gzip -9 at 65.0%

Compatible agents: Claude Code, Cursor, Codex, any compatible agent

Adoption & trust: 1 installs on skills.sh; 304 GitHub stars; 3/3 security scanners passed (skills.sh audits); trending (+100% hot-view momentum).

Journey fit

Spans multiple journey phases - primary shelf plus alternate fits below.

Primary fit

Log pastes most often surface during incident response and production debugging, which maps to Operate. Error and trace investigation is where oversized logs routinely trigger context bloat.

Also useful

Also useful

Where it fits

Example use

Tail and grep production errors before pasting a trace into Claude for root-cause analysis.

Example use

Filter a failed GitHub Actions log to the last 100 lines plus error keywords before asking the agent for a fix.

Example use