Sentry Setup Ai Monitoring

Name: Sentry Setup Ai Monitoring
Author: getsentry

getsentry/sentry-for-ai

2.5k installs
243 repo stars
Updated July 27, 2026

sentry-setup-ai-monitoring configures Sentry AI agent monitoring for detected LLM SDKs with tracing and optional prompt capture.

About

The sentry-setup-ai-monitoring skill configures Sentry to track LLM calls, agent executions, tool usage, and token consumption after detecting installed AI SDKs. It requires tracing to be enabled (tracesSampleRate above zero) and, for multi-turn chats, gen_ai.conversation.id to be set whenever an integration does not infer it. A data capture warning requires explicit user confirmation before enabling sendDefaultPii or recordInputs/recordOutputs, because prompts may contain PII regulated under GDPR or CCPA. Detection greps package.json or the Python requirements files for OpenAI, Anthropic, Vercel AI, LangChain, Google GenAI, and related packages, then checks the sampling configuration and offers tracesSampler patterns when AI spans would otherwise be dropped. On Node.js, JavaScript integrations such as openAIIntegration auto-enable at Sentry SDK 10.53.0+ with streamGenAiSpans, while browser and Next.js OpenAI clients need manual instrumentOpenAiClient wrapping. Python integrations auto-enable when the AI package is installed, at SDK 2.60.0+ with stream_gen_ai_spans. When no supported SDK is detected, manual instrumentation uses the gen_ai.chat, gen_ai.invoke_agent, gen_ai.execute_tool, and gen_ai.handoff span ops.

Requires tracesSampleRate above zero; AI spans need tracing enabled.
Detect AI SDKs in package.json or requirements before configuring integrations.
Prompt capture needs explicit user confirmation due to PII and compliance risk.
Node.js auto-enables OpenAI and LangChain integrations at SDK 10.53.0+.
Manual spans use the gen_ai.chat, gen_ai.invoke_agent, gen_ai.execute_tool, and gen_ai.handoff ops.

Sentry Setup Ai Monitoring by the numbers

2,473 all-time installs (skills.sh)
+51 installs in the week ending Jul 28, 2026 (Skillselion tracking)
Ranked #319 of 16,659 AI & Agent Building skills by installs in the Skillselion catalog
Security screen: MEDIUM risk (skills.sh audit)
Data as of Jul 28, 2026 (Skillselion catalog sync)

At a glance

sentry-setup-ai-monitoring capabilities & compatibility

Capabilities: ai sdk detection for javascript and python depen · auto and manual integration setup for major llm · sampling checks with tracessampler guidance for · pii aware prompt and output capture confirmation · manual gen_ai span types for chat, agents, tools
Works with: openai · anthropic · sentry
Use cases: orchestration · debugging
Pricing: Freemium

npx skills add https://github.com/getsentry/sentry-for-ai --skill sentry-setup-ai-monitoring

Security audit	3 / 3 scanners passed

How do I monitor OpenAI, Anthropic, or LangChain calls and agent tool usage in Sentry?

Configure Sentry AI agent monitoring for LLM calls, tool usage, token consumption, and conversation grouping across detected JavaScript or Python SDKs.

Who is it for?

Apps using supported AI SDKs that need LLM latency, token, and agent execution visibility.

Skip if: you only need non-AI error monitoring; use the framework-specific Sentry SDK setup skills instead.

When should I use this skill?

User asks to monitor LLM calls, track agents, measure token usage, or add AI observability.

What you get

Sentry init with streamGenAiSpans, appropriate integrations, and conversation IDs grouping AI spans.

traces_sampler configuration
gen_ai span sampling rules

By the numbers

Sampling patterns require @sentry/node >=9.x or sentry-sdk >=2.x
Returns sample rate 1.0 for matched gen_ai spans

Files

SKILL.md (Markdown)

Setup Sentry AI Agent Monitoring

Configure Sentry to track LLM calls, agent executions, tool usage, and token consumption.

Invoke This Skill When

User asks to "monitor AI/LLM calls" or "track OpenAI/Anthropic usage"
User wants "AI observability" or "agent monitoring"
User asks about token usage, model latency, or AI costs

Important: The SDK versions, API names, and code samples below are examples. Always verify against docs.sentry.io before implementing, as APIs and minimum versions may have changed.

Prerequisites

AI monitoring requires tracing enabled (tracesSampleRate > 0).

If the app has multi-turn chats, set a conversation ID by default anywhere it makes sense to identify a chat session. Sentry uses gen_ai.conversation.id to group related AI spans into Conversations. Some integrations infer it automatically, but many setups need to set it explicitly.

Data Capture Warning

Prompt and output recording captures user content that is likely PII. Before enabling send-default-PII (sendDefaultPii: true in JavaScript or send_default_pii=True in Python) or per-integration prompt/output capture (recordInputs/recordOutputs in JS, include_prompts in Python), confirm:

The application's privacy policy permits capturing user prompts and model responses
Captured data complies with applicable regulations (GDPR, CCPA, etc.)
Sentry data retention settings are appropriate for the sensitivity of the data

Ask the user whether they want prompt/output capture enabled. Do not enable prompt/output capture without explicit confirmation. Use tracesSampleRate: 1.0 only in development; in production, use a lower value or a tracesSampler function.

Detection First

Always detect installed AI SDKs before configuring:

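# JavaScript (scoped @langchain/* packages are matched by prefix)
grep -E '"(openai|@anthropic-ai/sdk|ai|@langchain/[^"]+|@google/genai)"' package.json

# Python (covers the packages listed under Supported SDKs)
grep -E '(openai|anthropic|langchain|langgraph|huggingface|google-genai|pydantic-ai|litellm|mcp)' requirements.txt pyproject.toml 2>/dev/null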
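
The same detection can be sketched in Python by parsing package.json instead of grepping it, which avoids false matches inside unrelated strings. This is a minimal sketch, not part of the skill: the names JS_AI_PACKAGES, JS_AI_PREFIXES, and detect_js_ai_sdks are hypothetical, and the package list is copied from the Supported SDKs table.

```python
import json

# Package names from the Supported SDKs table; the constants are illustrative.
JS_AI_PACKAGES = ("openai", "@anthropic-ai/sdk", "ai", "@google/genai")
JS_AI_PREFIXES = ("@langchain/",)


def detect_js_ai_sdks(package_json_text: str) -> list:
    """Return the AI SDK package names declared in a package.json document."""
    manifest = json.loads(package_json_text)
    declared = {}
    for section in ("dependencies", "devDependencies"):
        declared.update(manifest.get(section, {}))
    return sorted(
        name
        for name in declared
        if name in JS_AI_PACKAGES or name.startswith(JS_AI_PREFIXES)
    )


sample = (
    '{"dependencies": {"openai": "^4.0.0", '
    '"@langchain/core": "^0.3.0", "express": "^4.19.0"}}'
)
print(detect_js_ai_sdks(sample))  # ['@langchain/core', 'openai']
```

An empty result means no supported JavaScript SDK was found, which is the case where the Manual Instrumentation section applies.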

Sampling Check

After detecting AI SDKs, check the current sampling configuration:

# JavaScript
grep -E 'tracesSampleRate|tracesSampler' sentry.*.config.* instrument.* src/instrument.* app/instrument.* 2>/dev/null

# Python (recursive search; does not depend on shell globstar support)
grep -rE 'traces_sample_rate|traces_sampler' --include='*.py' . 2>/dev/null

If `tracesSampleRate` / `traces_sample_rate` is below 1.0 AND no `tracesSampler` / `traces_sampler` is configured:

Ask the user:

"Your current sample rate is {rate}. Agent runs are sampled as complete span trees — if the root span is dropped, all child gen_ai spans are lost. For full AI visibility, gen_ai-related transactions should be sampled at 100%. Would you like me to set up a tracesSampler that keeps AI traces at 100% while sampling other traffic at your current rate?"

If the user confirms, read ${SKILL_ROOT}/references/sampling.md for implementation patterns.
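
The decision rule above (rate below 1.0 and no sampler configured) can be written down as a small check. This is a rough sketch: sampling_status and both regular expressions are hypothetical helpers, and the regex only handles literal numeric rates, not values read from environment variables.

```python
import re

# Matches `tracesSampleRate: 0.2` (JS) and `traces_sample_rate=0.2` (Python).
RATE_RE = re.compile(
    r"(?:tracesSampleRate|traces_sample_rate)\s*[:=]\s*([0-9]*\.?[0-9]+)"
)
SAMPLER_RE = re.compile(r"\b(?:tracesSampler|traces_sampler)\b")


def sampling_status(config_text):
    """Return (rate, has_sampler, should_offer_sampler) for a config snippet."""
    has_sampler = bool(SAMPLER_RE.search(config_text))
    match = RATE_RE.search(config_text)
    rate = float(match.group(1)) if match else None
    # Offer a custom sampler only when a rate below 1.0 is the sole control.
    should_offer = rate is not None and rate < 1.0 and not has_sampler
    return rate, has_sampler, should_offer


print(sampling_status("Sentry.init({ tracesSampleRate: 0.2 })"))
# (0.2, False, True)
```

A rate of None means no literal sample rate was found, so the tracing prerequisite itself needs checking before any sampler is offered.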

Supported SDKs

JavaScript

Package	Integration	Min Sentry SDK	Auto?
`openai`	`openAIIntegration()`	10.53.0	Yes
`@anthropic-ai/sdk`	`anthropicAIIntegration()`	10.53.0	Yes
`ai` (Vercel)	`vercelAIIntegration()`	10.53.0	Yes*
`@langchain/*`	`langChainIntegration()`	10.53.0	Yes
`@langchain/langgraph`	`langGraphIntegration()`	10.53.0	Yes
`@google/genai`	`googleGenAIIntegration()`	10.53.0	Yes

*Vercel AI: 10.53.0+ required. Requires experimental_telemetry per-call.

Python

Integrations auto-enable when the AI package is installed — no explicit registration needed:

Package	Auto?	Notes
`openai`	Yes	Includes OpenAI Agents SDK
`anthropic`	Yes
`langchain` / `langgraph`	Yes
`huggingface_hub`	Yes
`google-genai`	Yes
`pydantic-ai`	Yes
`litellm`	No	Requires explicit integration
`mcp` (Model Context Protocol)	Yes

JavaScript Configuration

Node.js — auto-enabled integrations

Just ensure tracing is enabled. Integrations auto-enable when the AI package is installed:

import * as Sentry from "@sentry/node";

Sentry.init({
  dsn: "YOUR_DSN",
  tracesSampleRate: 1.0, // Lower in production (e.g., 0.1)
  streamGenAiSpans: true, // SDK ≥10.53.0
  // OpenAI, Anthropic, Google GenAI, LangChain integrations auto-enable in Node.js
});

To customize (e.g., enable prompt capture after user confirmation — see Data Capture Warning):

Sentry.init({
  dsn: "YOUR_DSN",
  tracesSampleRate: 1.0,
  streamGenAiSpans: true,
  sendDefaultPii: true,
  integrations: [
    Sentry.openAIIntegration({
      // recordInputs/recordOutputs default to true when sendDefaultPii is true
    }),
  ],
});

Browser / Next.js OpenAI (manual wrapping required)

In browser-side code or Next.js meta-framework apps, auto-instrumentation is not available. Wrap the client manually:

import OpenAI from "openai";
import * as Sentry from "@sentry/nextjs"; // or @sentry/react, @sentry/browser

const openai = Sentry.instrumentOpenAiClient(new OpenAI());
// Use 'openai' client as normal

LangChain / LangGraph (auto-enabled)

Sentry.init({
  dsn: "YOUR_DSN",
  tracesSampleRate: 1.0,
  streamGenAiSpans: true,
  sendDefaultPii: true, // Only after user confirmation (see Data Capture Warning)
  integrations: [
    Sentry.langChainIntegration(),
    Sentry.langGraphIntegration(),
  ],
});

Vercel AI SDK

Add to sentry.edge.config.ts for Edge runtime:

Sentry.init({
  dsn: "YOUR_DSN",
  tracesSampleRate: 1.0,
  streamGenAiSpans: true,
  sendDefaultPii: true, // Only after user confirmation (see Data Capture Warning)
  integrations: [Sentry.vercelAIIntegration()],
});

Enable telemetry per-call:

await generateText({
  model: openai("gpt-4o"),
  prompt: "Hello",
  experimental_telemetry: {
    isEnabled: true,
    recordInputs: true,
    recordOutputs: true,
  },
});

Python Configuration

Integrations auto-enable — just init with tracing. Only add explicit imports to customize options:

import sentry_sdk

sentry_sdk.init(
    dsn="YOUR_DSN",
    traces_sample_rate=1.0,  # Lower in production (e.g., 0.1)
    stream_gen_ai_spans=True,  # SDK ≥2.60.0
    send_default_pii=True,  # Only after user confirmation (see Data Capture Warning)
    # Integrations auto-enable when the AI package is installed.
    # Only specify explicitly to customize (e.g., include_prompts):
    # integrations=[OpenAIIntegration(include_prompts=True)],
)

Manual Instrumentation

Use when no supported SDK is detected. Follow the canonical Sentry Conventions for `gen_ai.*` attributes — the JS docs may lag behind; do not set attributes marked deprecated in the conventions.

Span Types

`op`	Span `name` pattern	Purpose
`gen_ai.{operation}` (e.g. `gen_ai.chat`, `gen_ai.request`)	`{operation} {model}` (e.g. `chat gpt-4o`)	Individual LLM call
`gen_ai.invoke_agent`	`invoke_agent {agent_name}`	Agent execution lifecycle
`gen_ai.execute_tool`	`execute_tool {tool_name}`	Tool/function call
`gen_ai.handoff`	`handoff from {source} to {target}`	Agent-to-agent transition

For LLM-call spans, the op follows the pattern gen_ai.{gen_ai.operation.name} — use gen_ai.chat, gen_ai.embeddings, gen_ai.generate_content, or gen_ai.text_completion where the operation is known. Span attributes only accept primitives; arrays/objects must be JSON-stringified.
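
The naming pattern and the primitives-only rule can be combined in one helper. This is a minimal sketch with no Sentry dependency: llm_span_fields is a hypothetical function that only prepares the op, name, and attribute values you would then pass to your SDK's span API.

```python
import json


def llm_span_fields(operation, model, attributes):
    """Build op, name, and primitive-only attributes for one LLM-call span."""
    clean = {}
    for key, value in attributes.items():
        if value is None:
            continue  # skip unset values rather than sending "null"
        if isinstance(value, (str, int, float, bool)):
            clean[key] = value
        else:
            clean[key] = json.dumps(value)  # lists and dicts become JSON strings
    clean["gen_ai.operation.name"] = operation
    clean["gen_ai.request.model"] = model
    return {
        "op": f"gen_ai.{operation}",
        "name": f"{operation} {model}",
        "attributes": clean,
    }


fields = llm_span_fields(
    "chat",
    "gpt-4o",
    {
        "gen_ai.input.messages": [
            {"role": "user", "parts": [{"type": "text", "content": "Tell me a joke"}]}
        ],
        "gen_ai.usage.input_tokens": 12,
    },
)
print(fields["op"], "|", fields["name"])  # gen_ai.chat | chat gpt-4o
```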

Example (JavaScript)

const inputMessages = [
  { role: "user", parts: [{ type: "text", content: "Tell me a joke" }] },
];

await Sentry.startSpan({
  op: "gen_ai.chat",
  name: "chat gpt-4o",
  attributes: {
    "gen_ai.request.model": "gpt-4o",
    "gen_ai.operation.name": "chat",
    "gen_ai.input.messages": JSON.stringify(inputMessages),
  },
}, async (span) => {
  const result = await llmClient.complete(inputMessages);

  const outputMessages = [
    {
      role: "assistant",
      parts: [
        // Thinking/reasoning content goes in a `reasoning` part, NOT a `text` part.
        // Sentry surfaces it separately and filters it out of the Conversations view.
        { type: "reasoning", content: result.reasoning },
        { type: "text", content: result.text },
      ],
      finish_reason: result.finishReason,
    },
  ];
  span.setAttribute("gen_ai.output.messages", JSON.stringify(outputMessages));
  span.setAttribute("gen_ai.usage.input_tokens", result.inputTokens);
  span.setAttribute("gen_ai.usage.output_tokens", result.outputTokens);
  return result;
});

Key Attributes

Common (all AI spans):

Attribute	Required	Description
`gen_ai.request.model`	Yes	Model identifier (e.g., `gpt-4o`, `claude-sonnet-4-6`)
`gen_ai.operation.name`	No	Operation label (`chat`, `embeddings`, `invoke_agent`, `execute_tool`, `handoff`, etc.)
`gen_ai.agent.name`	No	Agent name (set on agent and tool spans)

Request / response content (PII — enable only after confirming; see Data Capture Warning above):

Attribute	Description
`gen_ai.input.messages`	JSON-stringified array of input messages. Each item uses `{role, parts}` where `parts` is `[{type, content}]`; `role` is `"user"`, `"assistant"`, `"tool"`, or `"system"`. Common part `type`s: `"text"`, `"reasoning"`, `"tool_call"`, `"tool_call_response"`
`gen_ai.output.messages`	JSON-stringified array of response messages (text + tool calls), same shape as inputs
`gen_ai.system_instructions`	System prompt passed to the model
`gen_ai.tool.definitions`	JSON-stringified list of tools available to the model

Thinking / reasoning messages: Models with extended thinking (Anthropic thinking blocks, Gemini thought, DeepSeek reasoning_content) produce internal reasoning that isn't part of the user-visible reply. Represent it as a reasoning part inside the assistant message, {"type": "reasoning", "content": "..."}, alongside the user-facing text part. Sentry surfaces reasoning parts separately and filters them out of the user-facing Conversations view, so do not fold thinking into a text part. When previous thinking is fed back into a multi-turn request, include the same reasoning parts in the assistant messages within gen_ai.input.messages. Record reasoning token counts via gen_ai.usage.output_tokens.reasoning (a subset of gen_ai.usage.output_tokens).
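
A short Python sketch of the message shape described above, keeping reasoning in its own part. The helper name assistant_message is hypothetical, and the finish_reason field follows the JavaScript example rather than a confirmed schema.

```python
import json


def assistant_message(text, reasoning=None, finish_reason="stop"):
    """Build one assistant message with reasoning kept out of the text part."""
    parts = []
    if reasoning:
        parts.append({"type": "reasoning", "content": reasoning})
    parts.append({"type": "text", "content": text})
    return {"role": "assistant", "parts": parts, "finish_reason": finish_reason}


message = assistant_message(
    "Why did the chicken cross the road?",
    reasoning="The user wants a short joke.",
)
# Span attributes accept only primitives, so the list is JSON-stringified.
output_messages_attr = json.dumps([message])
print([part["type"] for part in message["parts"]])  # ['reasoning', 'text']
```

The resulting string is what would be set as gen_ai.output.messages, and only after the user has confirmed prompt and output capture.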

Token usage:

Attribute	Description
`gen_ai.usage.input_tokens`	Total input tokens — includes cached tokens
`gen_ai.usage.input_tokens.cached`	Subset of input tokens served from cache
`gen_ai.usage.input_tokens.cache_write`	Tokens written to cache while processing input
`gen_ai.usage.output_tokens`	Total output tokens — includes reasoning tokens
`gen_ai.usage.output_tokens.reasoning`	Subset of output tokens used for reasoning
`gen_ai.usage.total_tokens`	Sum of input + output tokens

Tool spans (`gen_ai.execute_tool`):

Attribute	Description
`gen_ai.tool.name`	Tool identifier
`gen_ai.tool.description`	Human-readable tool description
`gen_ai.tool.call.arguments`	JSON-stringified tool arguments
`gen_ai.tool.call.result`	JSON-stringified tool result

Token Usage and Cost Calculation

Sentry uses token attributes to calculate model costs. Cached and reasoning tokens are subsets, not separate counts — gen_ai.usage.input_tokens already includes gen_ai.usage.input_tokens.cached, and gen_ai.usage.output_tokens already includes gen_ai.usage.output_tokens.reasoning.

Sentry subtracts the cached/reasoning counts from the totals to compute the uncached/non-reasoning portion. Reporting a cached or reasoning count greater than its total produces negative costs in the dashboard.

Example — 100 input tokens total, 90 served from cache:

Correct: input_tokens = 100, input_tokens.cached = 90
Wrong: input_tokens = 10, input_tokens.cached = 90 (cached larger than total → negative cost)

The same rule applies to gen_ai.usage.output_tokens vs. gen_ai.usage.output_tokens.reasoning.
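
The subset rule is easy to check before attributes are set. This is a minimal sketch: token_breakdown is a hypothetical helper that mirrors the subtraction described above, and it makes no claim about how Sentry prices the resulting counts.

```python
def token_breakdown(input_tokens, cached_tokens, output_tokens, reasoning_tokens):
    """Validate subset counts and return the uncached / non-reasoning portions."""
    if cached_tokens > input_tokens:
        raise ValueError("input_tokens.cached must not exceed input_tokens")
    if reasoning_tokens > output_tokens:
        raise ValueError("output_tokens.reasoning must not exceed output_tokens")
    return {
        "uncached_input": input_tokens - cached_tokens,
        "non_reasoning_output": output_tokens - reasoning_tokens,
    }


# 100 input tokens with 90 cached; 40 output tokens with 10 used for reasoning.
print(token_breakdown(100, 90, 40, 10))
# {'uncached_input': 10, 'non_reasoning_output': 30}
```

Passing the "wrong" values from the example (input_tokens = 10, cached = 90) raises ValueError instead of silently producing a negative portion.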

Verification

After configuring, make an LLM call and check the Sentry Traces dashboard. AI spans appear with gen_ai.* operations showing model, token counts, and latency.

Conversations

Conversations gives a readable, chat-style view of past sessions with your AI agent. It groups spans by gen_ai.conversation.id — so whether a user talked across multiple traces or multiple conversations happened inside one trace, you get a timeline of every message, tool call, and response.

When the user asks for AI monitoring setup, proactively mention the conversation ID requirement if the app has multi-turn chats. Without a conversation ID, the agent-monitoring spans still work, but the Conversations view cannot group the session correctly.

Find it at Explore > Conversations in Sentry.

Prerequisites for Conversations

Tracing enabled with tracesSampleRate > 0
streamGenAiSpans: true (JS SDK >=10.53.0) / stream_gen_ai_spans=True (Python SDK >=2.60.0) — required so AI spans are sent as standalone items. Without this, spans with large inputs/outputs can hit transaction payload size limits and be dropped.
Input and output capture enabled — Conversations reconstructs the chat from gen_ai.input.messages and gen_ai.output.messages attributes. Set sendDefaultPii: true (JS) / send_default_pii=True (Python). Without it, conversations appear empty.

Setting a Conversation ID

Some integrations (OpenAI Agents SDK for Python, OpenAI SDK for Node) infer the conversation ID automatically. For all others, set it manually.

JavaScript

import * as Sentry from "@sentry/node"; // or @sentry/nextjs, @sentry/nestjs, etc.

// Set at the start of a conversation
Sentry.setConversationId("conv_abc123");

// All subsequent AI calls carry gen_ai.conversation.id: "conv_abc123"
await openai.chat.completions.create({
  model: "gpt-5.5",
  messages: [{ role: "user", content: "Hello" }],
});

Python

import sentry_sdk.ai

# Set at the start of a conversation
sentry_sdk.ai.set_conversation_id("conv_abc123")

# All subsequent AI calls carry gen_ai.conversation.id = "conv_abc123"

For example, the Python OpenAI integration picks up the conversation ID automatically when you use the conversation parameter:

import openai
import sentry_sdk

sentry_sdk.init(...)

conversation = openai.conversations.create()
response = openai.responses.create(
    model="gpt-5.4",
    input=[{"role": "user", "content": "What are the 5 Ds of dodgeball?"}],
    conversation=conversation.id  # automatically sets gen_ai.conversation.id
)

Conversations vs Traces

These are independent concepts:

A single conversation can span multiple traces (e.g., user refreshes the page mid-conversation — new trace, same conversation ID)
A single trace can contain spans from different conversations (e.g., user starts a new chat without refreshing)
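
Both cases can be seen in a toy grouping. This sketch uses made-up span records; the field names trace_id and conversation_id and the helper traces_per_conversation are illustrative and not a Sentry data format.

```python
from collections import defaultdict

spans = [
    {"trace_id": "t1", "conversation_id": "conv_a"},
    {"trace_id": "t2", "conversation_id": "conv_a"},  # page refresh: new trace
    {"trace_id": "t2", "conversation_id": "conv_b"},  # new chat, same trace
]


def traces_per_conversation(span_records):
    """Map each conversation ID to the sorted trace IDs its spans appear in."""
    grouped = defaultdict(set)
    for span in span_records:
        grouped[span["conversation_id"]].add(span["trace_id"])
    return {conv: sorted(traces) for conv, traces in grouped.items()}


print(traces_per_conversation(spans))
# {'conv_a': ['t1', 't2'], 'conv_b': ['t2']}
```

Here conv_a spans two traces, while trace t2 carries spans from two conversations.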

Troubleshooting

Issue	Solution
AI spans not appearing	Verify `tracesSampleRate > 0`, check SDK version
Token counts missing	Some providers don't return tokens for streaming
Negative or wrong costs in dashboard	Cached/reasoning tokens are subsets of totals — see Token Usage and Cost Calculation
Prompts not captured	Set `sendDefaultPii: true` (JS) or `send_default_pii=True` (Python); use `recordInputs`/`include_prompts` only for explicit overrides
Vercel AI not working	Add `experimental_telemetry` to each call
Conversations view empty	Ensure `streamGenAiSpans: true` / `stream_gen_ai_spans=True`, `sendDefaultPii: true` / `send_default_pii=True`, and a conversation ID is set

Sampling Strategy for AI Agent Spans

Requires @sentry/node >=9.x (for inheritOrSampleWith) or sentry-sdk >=2.x (for traces_sampler).

The Problem

Agent runs are span trees. Sampling decides at the root; children inherit. Drop the root, lose every child span. At any rate below 1.0, some agent executions are dropped in their entirety.

How It Works

tracesSampler / traces_sampler only fires on root spans. Non-root spans (including gen_ai.* children) inherit unconditionally.

Scenario 1: gen_ai span IS the root (cron, queue consumer, CLI). The sampler sees gen_ai.* directly. Match and return 1.0.

Scenario 2: gen_ai spans are children of HTTP transactions (most web apps). POST /api/chat is sampled before any AI code runs. Solution: sample AI routes at 1.0.

JavaScript

Sentry.init({
  dsn: process.env.SENTRY_DSN,
  tracesSampler: ({ name, attributes, inheritOrSampleWith }) => {
    // Standalone gen_ai root spans
    if (attributes?.['sentry.op']?.startsWith('gen_ai.') || attributes?.['gen_ai.system']) {
      return 1.0;
    }
    // HTTP routes that trigger AI calls
    if (name?.includes('/api/chat') || name?.includes('/api/agent')) {
      return 1.0;
    }
    return inheritOrSampleWith(0.2); // adjust to your baseline
  },
});

Python

import sentry_sdk


def traces_sampler(sampling_context):
    tx = sampling_context.get("transaction_context", {})
    op, name = tx.get("op", ""), tx.get("name", "")

    if op.startswith("gen_ai."):
        return 1.0
    if op == "http.server" and any(p in name for p in ["/api/chat", "/api/agent"]):
        return 1.0

    parent = sampling_context.get("parent_sampled")
    if parent is not None:
        return float(parent)
    return 0.2

sentry_sdk.init(dsn="...", traces_sampler=traces_sampler)

If AI is the core product, skip tracesSampler and use tracesSampleRate: 1.0.
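
Because the sampler is a plain function of a dict, its decisions can be checked without initializing the SDK. The sketch below wraps the same logic in a hypothetical factory, make_traces_sampler, so the AI routes and baseline rate are parameters; the context keys follow the Python example above, and the sample contexts are made up.

```python
def make_traces_sampler(ai_routes, baseline):
    """Return a sampler that keeps AI traces at 1.0 and others at `baseline`."""

    def traces_sampler(sampling_context):
        tx = sampling_context.get("transaction_context") or {}
        op = tx.get("op") or ""
        name = tx.get("name") or ""
        if op.startswith("gen_ai."):
            return 1.0  # gen_ai span is the root (cron, queue consumer, CLI)
        if op == "http.server" and any(route in name for route in ai_routes):
            return 1.0  # HTTP route known to trigger AI calls
        parent = sampling_context.get("parent_sampled")
        if parent is not None:
            return float(parent)  # follow the upstream sampling decision
        return baseline

    return traces_sampler


sampler = make_traces_sampler(["/api/chat", "/api/agent"], baseline=0.2)
chat = {"transaction_context": {"op": "http.server", "name": "POST /api/chat"}}
health = {"transaction_context": {"op": "http.server", "name": "GET /health"}}
print(sampler(chat), sampler(health))  # 1.0 0.2
```

The returned function has the same shape as the traces_sampler passed to sentry_sdk.init in the example above.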

Fallback: Metrics + Logs

If 100% tracing isn't feasible, emit metrics and logs on every LLM call (independent of trace sampling):

# Metrics - 100% coverage of cost/usage/latency
sentry_sdk.metrics.distribution("gen_ai.token_usage", usage.total_tokens,
    attributes={"model": model, "user_id": str(user.id)})
sentry_sdk.metrics.count("gen_ai.calls", 1,
    attributes={"model": model, "status": "error" if error else "success"})

# Logs - 100% searchable per-call records
sentry_sdk.logger.info("LLM call", model=model, input_tokens=usage.prompt_tokens,
    output_tokens=usage.completion_tokens, latency_ms=response_time_ms)

JS equivalent uses Sentry.metrics.* and Sentry.logger.* with the same attribute patterns.

Troubleshooting

Issue	Solution
gen_ai spans missing despite sampler returning 1.0	Parent HTTP transaction was sampled at a lower rate. Add the route to your sampler.
`tracesSampler` not called for gen_ai spans	Expected. It only runs on root spans. Sample the parent HTTP route instead.
All traces at 100%	Check the fallback rate in `inheritOrSampleWith()` / default return value.