
Systematic Debugging
Give your coding agent a repeatable debugging ritual instead of guessing fixes when tests flake or bugs resist one-shot patches.
Overview
Systematic debugging is an agent skill, most often used in Ship (also Build and Operate), that guides structured root-cause work, especially stabilizing flaky async and agent-thread tests, instead of arbitrary timeouts and guesswork.
Install
npx skills add https://github.com/sickn33/antigravity-awesome-skills --skill systematic-debugging
What is this skill?
- Condition-based waiting utilities that poll thread/event state instead of fixed sleeps
- Documented win: 15 flaky tests stabilized by replacing arbitrary timeouts
- Patterns for waitForEvent and multi-event synchronization in agent/thread test harnesses
- Encourages root-cause investigation before changing production code
- TypeScript-oriented examples you can adapt to Lace-style thread managers
- Default event wait timeout example: 5000ms with 10ms poll interval
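The condition-based waiting idea in the list above can be sketched in a harness-agnostic form. This is a minimal illustration, not the skill's own utility: the name `waitForCondition` and its `description` and `pollMs` parameters are hypothetical, and only the 5000ms timeout and 10ms poll interval come from the defaults quoted above.

```typescript
// Hypothetical helper (not part of the skill): poll a condition until it
// yields a value, or reject once the timeout has elapsed.
function waitForCondition<T>(
  condition: () => T | undefined,
  description: string,
  timeoutMs = 5000, // default timeout quoted in the listing
  pollMs = 10 // default poll interval quoted in the listing
): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const startTime = Date.now();

    const check = (): void => {
      const result = condition();
      if (result !== undefined) {
        // The state we were waiting for is observable: stop polling.
        resolve(result);
      } else if (Date.now() - startTime > timeoutMs) {
        // Fail loudly and say what was being waited for.
        reject(new Error(`Timeout waiting for ${description} after ${timeoutMs}ms`));
      } else {
        setTimeout(check, pollMs);
      }
    };

    check();
  });
}
```

In a test, a fixed sleep followed by an assertion would become something like `await waitForCondition(() => queue.length > 0 ? queue[0] : undefined, 'first queued job')`, where `queue` stands in for whatever state your own harness exposes. The wait ends as soon as the state is observable, and a failure names the state that never arrived.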
Adoption & trust: 1.1k installs on skills.sh; 40.1k GitHub stars; 3/3 security scanners passed (skills.sh audits).
What problem does it solve?
You keep patching symptoms—extra delays, broad try/catch, or disabling tests—because the real timing or state bug never gets isolated.
Who is it for?
Indie builders debugging async agents, integration tests, or race-prone UI/API flows who want a methodical loop before merging.
Skip if: you only need a one-line typo fix, or you already have a signed-off postmortem with reproduction steps. The full process is overkill there.
When should I use this skill?
Tests fail intermittently, agent threads miss events, or you are about to add another arbitrary delay instead of verifying state.
What do I get? / Deliverables
You replace flaky timeouts with condition-based waits, confirm the failing invariant, and land a fix backed by evidence so CI and local runs stay green.
- Reproduction steps with a condition-based wait helper
- Green test run or narrowed root-cause hypothesis
Journey fit
Spans multiple journey phases: primary shelf plus alternate fits below.
The canonical shelf is Ship/testing because the documented pattern fixes flaky tests by replacing arbitrary timeouts with condition-based waits, which is work you do before you trust a release. The testing subphase covers async race conditions, event polling utilities, and CI stabilization, exactly where systematic debugging pays off first.
Where it fits
Narrow a race between tool calls and results in a local agent harness before wiring the real API.
Stabilize CI by swapping fixed delays for event-type polling with explicit timeout errors.
Reproduce an intermittent production failure with the same wait-for-state pattern used in tests.
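The tool-call/result race and the "explicit timeout errors" mentioned above can be illustrated with a small self-contained sketch. Everything here is an assumption made for illustration: `HarnessEvent`, `EventLog`, and `waitForEventTypes` are hypothetical names, and the in-memory log stands in for whatever event store your harness actually uses.

```typescript
// Hypothetical event shape and in-memory log; a real harness would read
// events from its own thread manager instead.
type HarnessEvent = { type: string; id: string };

class EventLog {
  private events: HarnessEvent[] = [];

  add(event: HarnessEvent): void {
    this.events.push(event);
  }

  all(): HarnessEvent[] {
    return [...this.events];
  }
}

// Resolve once every required event type has been seen; otherwise reject
// with a timeout error that names the types still missing.
function waitForEventTypes(
  log: EventLog,
  requiredTypes: string[],
  timeoutMs = 5000,
  pollMs = 10
): Promise<HarnessEvent[]> {
  return new Promise<HarnessEvent[]>((resolve, reject) => {
    const startTime = Date.now();

    const check = (): void => {
      const seen = new Set(log.all().map((e) => e.type));
      const missing = requiredTypes.filter((t) => !seen.has(t));

      if (missing.length === 0) {
        resolve(log.all().filter((e) => requiredTypes.includes(e.type)));
      } else if (Date.now() - startTime > timeoutMs) {
        reject(new Error(`Timeout after ${timeoutMs}ms; still missing: ${missing.join(', ')}`));
      } else {
        setTimeout(check, pollMs);
      }
    };

    check();
  });
}
```

Naming the missing event types in the error is a design choice aimed at diagnosis: a failure that reports `still missing: TOOL_RESULT` tells you which side of the race never completed, which a bare timeout or a failed assertion after a fixed sleep does not.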
How it compares
Use instead of ad-hoc “add a 5s sleep” chat debugging when failures are intermittent or environment-sensitive.
Common Questions / FAQ
Who is systematic-debugging for?
Solo and indie developers using AI coding agents on TypeScript or agent-thread test stacks who struggle with flaky CI and unclear failure modes.
When should I use systematic-debugging?
During Ship when tests fail randomly; during Build when integrating async event pipelines; during Operate when production errors need reproduction without guessing; whenever a fix would otherwise be a blind timeout or log spam.
Is systematic-debugging safe to install?
It is procedural guidance and example utilities—review the Security Audits panel on this skill’s Prism page before trusting any community package in your repo or CI.
SKILL.md
SKILL.md - Systematic Debugging
// Complete implementation of condition-based waiting utilities
// From: Lace test infrastructure improvements (2025-10-03)
// Context: Fixed 15 flaky tests by replacing arbitrary timeouts

import type { ThreadManager } from '~/threads/thread-manager';
import type { LaceEvent, LaceEventType } from '~/threads/types';

/**
 * Wait for a specific event type to appear in thread
 *
 * @param threadManager - The thread manager to query
 * @param threadId - Thread to check for events
 * @param eventType - Type of event to wait for
 * @param timeoutMs - Maximum time to wait (default 5000ms)
 * @returns Promise resolving to the first matching event
 *
 * Example:
 *   await waitForEvent(threadManager, agentThreadId, 'TOOL_RESULT');
 */
export function waitForEvent(
  threadManager: ThreadManager,
  threadId: string,
  eventType: LaceEventType,
  timeoutMs = 5000
): Promise<LaceEvent> {
  return new Promise((resolve, reject) => {
    const startTime = Date.now();

    const check = () => {
      const events = threadManager.getEvents(threadId);
      const event = events.find((e) => e.type === eventType);

      if (event) {
        resolve(event);
      } else if (Date.now() - startTime > timeoutMs) {
        reject(new Error(`Timeout waiting for ${eventType} event after ${timeoutMs}ms`));
      } else {
        setTimeout(check, 10); // Poll every 10ms for efficiency
      }
    };

    check();
  });
}

/**
 * Wait for a specific number of events of a given type
 *
 * @param threadManager - The thread manager to query
 * @param threadId - Thread to check for events
 * @param eventType - Type of event to wait for
 * @param count - Number of events to wait for
 * @param timeoutMs - Maximum time to wait (default 5000ms)
 * @returns Promise resolving to all matching events once count is reached
 *
 * Example:
 *   // Wait for 2 AGENT_MESSAGE events (initial response + continuation)
 *   await waitForEventCount(threadManager, agentThreadId, 'AGENT_MESSAGE', 2);
 */
export function waitForEventCount(
  threadManager: ThreadManager,
  threadId: string,
  eventType: LaceEventType,
  count: number,
  timeoutMs = 5000
): Promise<LaceEvent[]> {
  return new Promise((resolve, reject) => {
    const startTime = Date.now();

    const check = () => {
      const events = threadManager.getEvents(threadId);
      const matchingEvents = events.filter((e) => e.type === eventType);

      if (matchingEvents.length >= count) {
        resolve(matchingEvents);
      } else if (Date.now() - startTime > timeoutMs) {
        reject(
          new Error(
            `Timeout waiting for ${count} ${eventType} events after ${timeoutMs}ms (got ${matchingEvents.length})`
          )
        );
      } else {
        setTimeout(check, 10);
      }
    };

    check();
  });
}

/**
 * Wait for an event matching a custom predicate
 * Useful when you need to check event data, not just type
 *
 * @param threadManager - The thread manager to query
 * @param threadId - Thread to check for events
 * @param predicate - Function that returns true when event matches
 * @param description - Human-readable description for error messages
 * @param timeoutMs - Maximum time to wait (default 5000ms)
 * @returns Promise resolving to the first matching event
 *
 * Example:
 *   // Wait for TOOL_RESULT with specific ID
 *   await waitForEventMatch(
 *     threadManager,
 *     agentThreadId,
 *     (e) => e.type === 'TOOL_RESULT' && e.data.id === 'call_123',
 *     'TOOL_RESULT with id=call_123'
 *   );
 */
export function waitForEventMatch(
  threadManager: ThreadManager,
  threadId: string,
  predicate: (event: LaceEvent) => boolean,
  description: string,
  timeoutMs = 5000
): Promise<LaceEvent> {
  return new Promise((resolve, reject) => {
    const startTime = Date.now();

    const check = () => {
      const events = threadManager.getEvents(threadId);
      const event = events.find(predicate);

      if (event) {
        resolve(event);
      } else if (Date.now() - startTi