
Sentry Instrumentation Guide
Choose whether an observability moment belongs in Sentry errors, traces/spans, metrics, or logs so production AI apps get the right signal without noisy Issues.
Overview
sentry-instrumentation-guide is an agent skill most often used in Operate (also Ship and Build) that explains how to map runtime behavior to Sentry errors, traces, metrics, and logs instead of mis-instrumenting AI servic
Install
npx skills add https://github.com/getsentry/sentry-for-ai --skill sentry-instrumentation-guide
What is this skill?
- Deep dive on four Sentry signals: errors, traces/spans, metrics, and logs—with when-to-use for each
- Errors become deduplicated Issues with an owner workflow; the error feed stays empty when nothing threw, even if users are in pain
- Traces/spans for cross-service waterfalls and for timing slow DB, API, and LLM tool calls inside a request
- Tiebreakers and overlap resolution when one event could land in multiple feeds
- Explains why logging everything, or emitting one wide event, is no substitute for intentional instrumentation
- Covers four instrumentation signals in depth: errors, traces/spans, metrics, and logs
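As a rough illustration of the when-to-use split listed above, the sketch below maps one observability moment to one of the four signals. `Signal` and `choose_signal` are hypothetical names invented for this example, and the order of the checks is an assumption made for the sketch; it is not the skill's actual tiebreaker logic and not part of any Sentry SDK.

```python
from enum import Enum


class Signal(Enum):
    """The four Sentry signals named in the list above."""
    ERROR = "error"
    SPAN = "span"
    METRIC = "metric"
    LOG = "log"


def choose_signal(*, threw: bool = False, needs_timing: bool = False,
                  needs_trend: bool = False) -> Signal:
    """Hypothetical helper: pick one signal for an observability moment.

    The check order is an assumption for this sketch, not the skill's rules.
    """
    if threw:
        # Code threw an exception: it should become a tracked Issue.
        return Signal.ERROR
    if needs_timing:
        # Timing inside a request belongs on a span in the trace.
        return Signal.SPAN
    if needs_trend:
        # Rates and trends over time belong in a metric.
        return Signal.METRIC
    # Otherwise capture the state of this moment as a structured log.
    return Signal.LOG
```

A slow LLM tool call that did not throw would be passed as `choose_signal(needs_timing=True)` and land on a span rather than in the Issues feed.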
Adoption & trust: 1 install on skills.sh; 197 GitHub stars.
What problem does it solve?
Your Sentry project is either silent on real user pain or flooded with Issues because you treat every anomaly as an exception or one mega-event.
Who is it for?
Solo builders operating LLM-backed APIs or multi-service agents who already use Sentry and need a signal-selection playbook beyond the parent skill’s decision table.
Skip if: Greenfield projects with no Sentry SDK yet, or teams wanting copy-paste code without reading signal semantics.
When should I use this skill?
User asks how to instrument Sentry for AI apps, which signal to use, or why errors/traces/metrics/logs overlap.
What do I get? / Deliverables
You instrument exceptions, spans, metrics, and logs in the right feeds so Issues track real breakages and traces show where LLM or API steps lag—without claiming success paths as errors.
- Signal-selection rationale aligned to Sentry Issues and trace workflows
- Instrumentation decisions documented per request path or agent step
Journey fit
Spans multiple journey phases - primary shelf plus alternate fits below.
Signal choice matters most once code runs in production, so the canonical shelf is Operate errors—even though instrumentation is added during Ship and refined in Build. The guide centers on the Issues workflow (exceptions, deduplication, assignment) and complementary traces for request-path failures.
Where it fits
Design span boundaries around each agent tool call before merging the inference API.
Pre-launch checklist: verify exceptions open Issues while happy-path empty results stay out of the error feed.
Investigate user complaints with no Issues by reading trace waterfalls for slow external calls.
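The pre-launch check described above (exceptions reach the error feed, happy-path empty results do not) could be sketched roughly as follows. `FakeFeeds`, `run_search`, and `failing_backend` are hypothetical stand-ins written for this example; in a real project the two lists would be replaced by whatever your Sentry SDK exposes for capturing exceptions and structured logs.

```python
from dataclasses import dataclass, field


@dataclass
class FakeFeeds:
    """Hypothetical in-memory stand-in for Sentry's error and log feeds."""
    errors: list = field(default_factory=list)
    logs: list = field(default_factory=list)


def run_search(query: str, backend, feeds: FakeFeeds) -> list:
    """Run a search and route each outcome to the matching feed."""
    try:
        rows = backend(query)
    except Exception as exc:
        # Something threw: this belongs in the error feed.
        feeds.errors.append(exc)
        raise
    if not rows:
        # Zero rows is a successful request; record context as a log only.
        feeds.logs.append({"event": "search.empty", "query": query})
    return rows


def failing_backend(query: str) -> list:
    """Hypothetical backend that always times out."""
    raise TimeoutError("backend timed out")


feeds = FakeFeeds()

# Happy path with an empty result: nothing should reach the error feed.
assert run_search("no-match", lambda q: [], feeds) == []
assert feeds.errors == []
assert len(feeds.logs) == 1

# A thrown exception: exactly one entry should reach the error feed.
try:
    run_search("anything", failing_backend, feeds)
except TimeoutError:
    pass
assert len(feeds.errors) == 1
```

The same two assertions can be kept as a regression test so that a later refactor does not start reporting empty results as errors.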
How it compares
Observability design guide for Sentry signal placement—not a generic debug printf skill or a hosting deploy recipe.
Common Questions / FAQ
Who is sentry-instrumentation-guide for?
Developers shipping AI or API products with Sentry enabled who need to decide what to capture as Issues versus spans, metrics, or structured logs.
When should I use sentry-instrumentation-guide?
During Ship when adding SDK instrumentation; in Operate when triaging empty Issues feeds or missing latency clues; in Build when designing agent tool-call tracing before launch.
Is sentry-instrumentation-guide safe to install?
It is documentation-only instrumentation guidance; review the Security Audits panel on this page for the parent sentry-for-ai package before enabling production DSNs in your agent environment.
SKILL.md - Sentry Instrumentation Guide
# Choosing Signals: Deep Dive

The main [SKILL.md](../SKILL.md) gives the decision table and the four tiebreakers. This file is the reasoning behind them: what each signal is *for*, how to resolve the overlaps when the same value could go more than one place, why retention differs per signal, and the answer to "can't I just log everything / emit one wide event and derive the rest?"

## Each Signal, In Depth

### Errors: "What just broke?"

A stack trace and an exception type, grouped into an **Issue** that gets deduplicated, assigned, and tracked until it's resolved. The defining trait is the workflow: errors aren't just recorded, they become work items with an owner and a lifecycle.

- **Reach for it when:** code threw an exception, or you have a condition serious enough that it *should* halt and be tracked to resolution.
- **Workflow it feeds:** the Issues feed (grouping, assignment, regression detection, alerting on new/regressed issues).
- **Gotcha:** a successful request is not an error. A query that returns zero rows succeeded. If nothing threw, the error feed will be empty even while users are unhappy, which is exactly the case where you need the other three signals.

### Traces and spans: "Did the request flow the way it was supposed to?"

Timed operations nested inside a trace, rendered as a waterfall. This is how you follow a request across services and see the DB query that dragged, the API call that timed out, the LLM tool call that took 8 seconds instead of 200 ms.

- **Reach for it when:** you want timing, or you want to confirm the request took the path you expected.
- **Workflow it feeds:** the trace waterfall, a structured dependency tree with timing on every node. Critically, this is a format a coding agent can reason about directly: it can read the spans, find work that could run in parallel, and rewrite the code. Hand it the same information as a stream of log lines and it has to reconstruct the call graph from timestamps first.
- **Gotcha:** most spans are auto-instrumented (framework + DB integrations). You rarely write one by hand, and a clean trace can still hide a quietly wrong outcome. A span tells you the request *flowed* as designed; it can't tell you the design just failed this user.

### Logs: "What was true at this point, and why?"

The state of the system at one specific moment, captured as a structured event: config values, feature flags, the inputs and outputs of a function, the user ID. Logs are the trail through a function's decision tree: the markers you drop where the code makes a choice, so a human or an agent can later follow the reasoning. They fill in the *why* once errors and traces have told you *what* broke and *where* the time went.

- **Reach for it when:** you need to reconstruct one specific request's decisions after the fact, especially the request from a support ticket.
- **Workflow it feeds:** searchable structured records you can pull up by `user_id`, `trace_id`, or any attribute.
- **Gotcha:** logs are most valuable when they're **wide**: a structured event packed with context (the flag that was on, the inputs, the outcome), not a bare one-line string.

### Metrics: "How have the key parts behaved over time?"

Counters, gauges, and distributions, each kept as an individual measurement you can slice by any attribute and drill from an aggregate back into the samples (and the trace) behind it. Not just "12,000 checkouts this week," but how that splits by region and how the line moved across the last deploy.

- **Reach for it when:** you want a rate, a trend, a threshold to alert on, or a number to chart on a dashboard. Metrics are a historical signal as much as a right-now one.
- **Workflow it feeds:** charts, dashboards, and alerts, plus drill-down from an aggregate into the individual samples behind it.
- **Gotcha:** keep attribute cardinality low. High-cardinality attributes (like raw `user_id`) degrade backend performance; that lev