Phoenix Tracing

Name: Phoenix Tracing
Author: github

github/awesome-copilot

984 installs
37.1k repo stars
Updated July 28, 2026

phoenix-tracing is a skill for Phoenix OpenInference LLM tracing instrumentation.

About

The phoenix-tracing skill documents OpenInference semantic conventions and instrumentation for Phoenix AI observability across Python and TypeScript. Reference categories cover setup, auto and manual instrumentation, nine span kinds, projects, sessions, production batching, and feedback annotations. Agents start with setup-python or setup-typescript guides then add custom spans following OpenInference attributes. Compatibility requires a Phoenix server plus arize-phoenix-otel or @arizeai/phoenix-otel packages depending on language.

OpenInference conventions for Phoenix LLM tracing.
Python and TypeScript setup and instrumentation references.
Nine span kinds with attribute conventions.
Production batching, masking, and feedback annotations.
Requires Phoenix server and language-specific otel packages.

Phoenix Tracing by the numbers

984 all-time installs (skills.sh)
+22 installs in the week ending Jul 28, 2026 (Skillselion tracking)
Ranked #1,061 of 16,659 AI & Agent Building skills by installs in the Skillselion catalog
Security screen: MEDIUM risk (skills.sh audit)
Data as of Jul 28, 2026 (Skillselion catalog sync)

From the docs

What phoenix-tracing says it does

OpenInference semantic conventions and instrumentation for Phoenix AI observability.

SKILL.md

npx skills add https://github.com/github/awesome-copilot --skill phoenix-tracing

Installs	984
repo stars	★ 37.1k
Security audit	2 / 3 scanners passed
Last updated	July 28, 2026
Repository	github/awesome-copilot ↗

How do I set up Phoenix tracing with correct OpenInference spans?

Implement Phoenix OpenInference LLM tracing with setup, spans, and production deployment guides.

Who is it for?

Teams implementing LLM observability with Phoenix and OpenInference.

Skip if: you only need Phoenix CLI debugging and are not adding new instrumentation.

When should I use this skill?

Use it when implementing Phoenix tracing, OpenInference spans, or LLM observability.

What you get

An LLM app instrumented with Phoenix tracing, following the setup reference for your language.

Instrumented LLM spans
OpenInference attribute map
Phoenix-exportable trace configuration

Files

references/

SKILL.md

Phoenix Tracing

Comprehensive guide for instrumenting LLM applications with OpenInference tracing in Phoenix. Contains reference files covering setup, instrumentation, span types, and production deployment.

When to Apply

Reference these guidelines when:

Setting up Phoenix tracing (Python or TypeScript)
Creating custom spans for LLM operations
Adding attributes following OpenInference conventions
Deploying tracing to production
Querying and analyzing trace data

Reference Categories

Priority	Category	Description	Prefix
1	Setup	Installation and configuration	`setup-*`
2	Instrumentation	Auto and manual tracing	`instrumentation-*`
3	Span Types	9 span kinds with attributes	`span-*`
4	Organization	Projects and sessions	`projects-*`, `sessions-*`
5	Enrichment	Custom metadata	`metadata-*`
6	Production	Batch processing, masking	`production-*`
7	Feedback	Annotations and evaluation	`annotations-*`

Quick Reference

1. Setup (START HERE)

setup-python - Install arize-phoenix-otel, configure endpoint
setup-typescript - Install @arizeai/phoenix-otel, configure endpoint

2. Instrumentation

instrumentation-auto-python - Auto-instrument OpenAI, LangChain, etc.
instrumentation-auto-typescript - Auto-instrument supported frameworks
instrumentation-manual-python - Custom spans with decorators
instrumentation-manual-typescript - Custom spans with wrappers

3. Span Types (with full attribute schemas)

span-llm - LLM API calls (model, tokens, messages, cost)
span-chain - Multi-step workflows and pipelines
span-retriever - Document retrieval (documents, scores)
span-tool - Function/API calls (name, parameters)
span-agent - Multi-step reasoning agents
span-embedding - Vector generation
span-reranker - Document re-ranking
span-guardrail - Safety checks
span-evaluator - LLM evaluation

4. Organization

projects-python / projects-typescript - Group traces by application
sessions-python / sessions-typescript - Track conversations

5. Enrichment

metadata-python / metadata-typescript - Custom attributes

6. Production (CRITICAL)

production-python / production-typescript - Batch processing, PII masking

7. Feedback

annotations-overview - Feedback concepts
annotations-python / annotations-typescript - Add feedback to spans

Reference Files

fundamentals-overview - Traces, spans, attributes basics
fundamentals-required-attributes - Required fields per span type
fundamentals-universal-attributes - Common attributes (user.id, session.id)
fundamentals-flattening - JSON flattening rules
attributes-messages - Chat message format
attributes-metadata - Custom metadata schema
attributes-graph - Agent workflow attributes
attributes-exceptions - Error tracking

Common Workflows

Quick Start: setup-{lang} → instrumentation-auto-{lang} → Check Phoenix
Custom Spans: setup-{lang} → instrumentation-manual-{lang} → span-{type}
Session Tracking: sessions-{lang} for conversation grouping patterns
Production: production-{lang} for batching, masking, and deployment

How to Use This Skill

Navigation Patterns:

# By category prefix
references/setup-*              # Installation and configuration
references/instrumentation-*    # Auto and manual tracing
references/span-*               # Span type specifications
references/sessions-*           # Session tracking
references/production-*         # Production deployment
references/fundamentals-*       # Core concepts
references/attributes-*         # Attribute specifications

# By language
references/*-python.md          # Python implementations
references/*-typescript.md      # TypeScript implementations
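
The prefix and language patterns above behave like ordinary shell globs. As a rough illustration, the sketch below applies them with Python's standard `fnmatch` module to a short, illustrative list of file names (the list is an assumption and covers only a few of the reference files):

```python
from fnmatch import fnmatch

# A few file names taken from the categories above; the list is illustrative.
files = [
    "references/setup-python.md",
    "references/setup-typescript.md",
    "references/span-llm.md",
    "references/production-python.md",
    "references/fundamentals-flattening.md",
]

# Select by language suffix and by category prefix.
python_files = [f for f in files if fnmatch(f, "references/*-python.md")]
span_files = [f for f in files if fnmatch(f, "references/span-*")]

print(python_files)  # ['references/setup-python.md', 'references/production-python.md']
print(span_files)    # ['references/span-llm.md']
```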

Reading Order:

1. Start with setup-{lang} for your language
2. Choose instrumentation-auto-{lang} OR instrumentation-manual-{lang}
3. Reference span-{type} files as needed for specific operations
4. See fundamentals-* files for attribute specifications

References

Phoenix Documentation:

Python API Documentation:

Python OTEL Package - arize-phoenix-otel API reference
Python Client Package - arize-phoenix-client API reference

TypeScript API Documentation:

TypeScript Packages - @arizeai/phoenix-otel, @arizeai/phoenix-client, and other TypeScript packages

Annotations Overview

Annotations allow you to add human or automated feedback to traces, spans, documents, and sessions. Annotations are essential for evaluation, quality assessment, and building training datasets.

Annotation Types

Phoenix supports four types of annotations:

Type	Target	Purpose	Example Use Case
Span Annotation	Individual span	Feedback on a specific operation	"This LLM response was accurate"
Document Annotation	Document within a RETRIEVER span	Feedback on retrieved document relevance	"This document was not helpful"
Trace Annotation	Entire trace	Feedback on end-to-end interaction	"User was satisfied with result"
Session Annotation	User session	Feedback on multi-turn conversation	"Session ended successfully"

Annotation Fields

Every annotation has these fields:

Required Fields

Field	Type	Description
Entity ID	String	ID of the target entity (span_id, trace_id, session_id, or document_position)
`name`	String	Annotation name/label (e.g., "quality", "relevance", "helpfulness")

Result Fields (At Least One Required)

Field	Type	Description
`label`	String (optional)	Categorical value (e.g., "good", "bad", "relevant", "irrelevant")
`score`	Float (optional)	Numeric value (typically 0-1, but can be any range)
`explanation`	String (optional)	Free-text explanation of the annotation

At least one of label, score, or explanation must be provided.

Optional Fields

Field	Type	Description
`annotator_kind`	String	Who created this annotation: "HUMAN", "LLM", or "CODE" (default: "HUMAN")
`identifier`	String	Unique identifier for upsert behavior (updates existing if same name+entity+identifier)
`metadata`	Object	Custom metadata as key-value pairs

Annotator Kinds

Kind	Description	Example
`HUMAN`	Manual feedback from a person	User ratings, expert labels
`LLM`	Automated feedback from an LLM	GPT-4 evaluating response quality
`CODE`	Automated feedback from code	Rule-based checks, heuristics
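
The field rules above can be checked locally before anything is sent to Phoenix. The sketch below is a hypothetical stand-in, not a Phoenix client class: `AnnotationDraft` and its `validate` method are names invented here to show the required fields, the "at least one result field" rule, and the three annotator kinds.

```python
from dataclasses import dataclass, field
from typing import Optional

ANNOTATOR_KINDS = {"HUMAN", "LLM", "CODE"}


@dataclass
class AnnotationDraft:
    """Hypothetical local model of an annotation; not a Phoenix client class."""

    entity_id: str
    name: str
    label: Optional[str] = None
    score: Optional[float] = None
    explanation: Optional[str] = None
    annotator_kind: str = "HUMAN"
    identifier: str = ""
    metadata: dict = field(default_factory=dict)

    def validate(self) -> None:
        # Required fields: a target entity and an annotation name.
        if not self.entity_id or not self.name:
            raise ValueError("entity_id and name are required")
        # At least one result field must be present.
        if self.label is None and self.score is None and self.explanation is None:
            raise ValueError("provide at least one of label, score, or explanation")
        if self.annotator_kind not in ANNOTATOR_KINDS:
            raise ValueError(f"unknown annotator_kind: {self.annotator_kind}")


ok = AnnotationDraft(entity_id="abc123", name="quality", score=0.95)
ok.validate()  # passes: one result field is enough
```

A draft with a name but no label, score, or explanation raises `ValueError` in this sketch, mirroring the rule stated above.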

Examples

Quality Assessment:

quality - Overall quality (label: good/fair/poor, score: 0-1)
correctness - Factual accuracy (label: correct/incorrect, score: 0-1)
helpfulness - User satisfaction (label: helpful/not_helpful, score: 0-1)

RAG-Specific:

relevance - Document relevance to query (label: relevant/irrelevant, score: 0-1)
faithfulness - Answer grounded in context (label: faithful/unfaithful, score: 0-1)

Safety:

toxicity - Contains harmful content (score: 0-1)
pii_detected - Contains personally identifiable information (label: yes/no)

Python SDK Annotation Patterns

Add feedback to spans, traces, documents, and sessions using the Python client.

Client Setup

from phoenix.client import Client
client = Client()  # Default: http://localhost:6006

Span Annotations

Add feedback to individual spans:

client.spans.add_span_annotation(
    span_id="abc123",
    annotation_name="quality",
    annotator_kind="HUMAN",
    label="high_quality",
    score=0.95,
    explanation="Accurate and well-formatted",
    metadata={"reviewer": "alice"},
    sync=True
)

Document Annotations

Rate individual documents in RETRIEVER spans:

client.spans.add_document_annotation(
    span_id="retriever_span",
    document_position=0,  # 0-based index
    annotation_name="relevance",
    annotator_kind="LLM",
    label="relevant",
    score=0.95
)

Trace Annotations

Feedback on entire traces:

client.traces.add_trace_annotation(
    trace_id="trace_abc",
    annotation_name="correctness",
    annotator_kind="HUMAN",
    label="correct",
    score=1.0
)

Span Notes

Notes are a special type of annotation for free-form text — useful for open coding, where reviewers leave qualitative observations on a span before any rubric exists. Later, those notes can be aggregated and distilled into structured labels or scores.

Notes are append-only: each call auto-generates a UUIDv4 identifier, so multiple notes naturally accumulate on the same span. Structured annotations are keyed by (name, span_id, identifier) — you can have many same-named annotations on one span by supplying distinct identifiers (e.g. one per reviewer); writing the same (name, span_id, identifier) overwrites the existing entry.

client.spans.add_span_note(
    span_id="abc123def456",
    note="Unexpected token in response, needs review",
)
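
To make the keying rule concrete, here is a minimal in-memory sketch of the described behavior. It is not how the Phoenix server stores annotations; the `annotations` dict, both helper functions, and the use of "note" as the annotation name for notes are assumptions made for illustration.

```python
import uuid

# Hypothetical in-memory store keyed the way the text describes:
# (name, span_id, identifier). Not the Phoenix server's implementation.
annotations: dict[tuple[str, str, str], dict] = {}


def upsert_annotation(name: str, span_id: str, identifier: str = "", **result) -> None:
    # The same key overwrites; a new identifier adds another entry.
    annotations[(name, span_id, identifier)] = result


def add_note(span_id: str, note: str) -> None:
    # Notes get a fresh UUIDv4 identifier on every call, so they accumulate.
    upsert_annotation("note", span_id, str(uuid.uuid4()), explanation=note)


upsert_annotation("quality", "abc123", "alice", label="good")
upsert_annotation("quality", "abc123", "bob", label="fair")
upsert_annotation("quality", "abc123", "alice", label="poor")  # overwrites alice's entry
add_note("abc123", "First observation")
add_note("abc123", "Second observation")

print(len(annotations))  # 4: two quality entries plus two notes
```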

Session Annotations

Feedback on multi-turn conversations:

client.sessions.add_session_annotation(
    session_id="session_xyz",
    annotation_name="user_satisfaction",
    annotator_kind="HUMAN",
    label="satisfied",
    score=0.85
)

RAG Pipeline Example

from phoenix.client import Client
from phoenix.client.resources.spans import SpanDocumentAnnotationData

client = Client()

# Document relevance (batch)
client.spans.log_document_annotations(
    document_annotations=[
        SpanDocumentAnnotationData(
            name="relevance", span_id="retriever_span", document_position=i,
            annotator_kind="LLM", result={"label": label, "score": score}
        )
        for i, (label, score) in enumerate([
            ("relevant", 0.95), ("relevant", 0.80), ("irrelevant", 0.10)
        ])
    ]
)

# LLM response quality
client.spans.add_span_annotation(
    span_id="llm_span",
    annotation_name="faithfulness",
    annotator_kind="LLM",
    label="faithful",
    score=0.90
)

# Overall trace quality
client.traces.add_trace_annotation(
    trace_id="trace_123",
    annotation_name="correctness",
    annotator_kind="HUMAN",
    label="correct",
    score=1.0
)

API Reference

Python Client API

TypeScript SDK Annotation Patterns

Add feedback to spans, traces, documents, and sessions using the TypeScript client.

Client Setup

import { createClient } from "@arizeai/phoenix-client";
const client = createClient();  // Default: http://localhost:6006

Span Annotations

Add feedback to individual spans:

import { addSpanAnnotation } from "@arizeai/phoenix-client/spans";

await addSpanAnnotation({
  client,
  spanAnnotation: {
    spanId: "abc123",
    name: "quality",
    annotatorKind: "HUMAN",
    label: "high_quality",
    score: 0.95,
    explanation: "Accurate and well-formatted",
    metadata: { reviewer: "alice" }
  },
  sync: true
});

Span Notes

Notes are append-only: each call auto-generates a UUIDv4 identifier, so multiple notes naturally accumulate on the same span. Structured annotations are keyed by (name, spanId, identifier) — you can have many same-named annotations on one span by supplying distinct identifiers (e.g. one per reviewer); writing the same (name, spanId, identifier) overwrites the existing entry.

import { addSpanNote } from "@arizeai/phoenix-client/spans";

await addSpanNote({
  client,
  spanNote: {
    spanId: "abc123",
    note: "This span shows unexpected behavior, needs review"
  }
});

Document Annotations

Rate individual documents in RETRIEVER spans:

import { addDocumentAnnotation } from "@arizeai/phoenix-client/spans";

await addDocumentAnnotation({
  client,
  documentAnnotation: {
    spanId: "retriever_span",
    documentPosition: 0,  // 0-based index
    name: "relevance",
    annotatorKind: "LLM",
    label: "relevant",
    score: 0.95
  }
});

Trace Annotations

Feedback on entire traces:

import { addTraceAnnotation } from "@arizeai/phoenix-client/traces";

await addTraceAnnotation({
  client,
  traceAnnotation: {
    traceId: "trace_abc",
    name: "correctness",
    annotatorKind: "HUMAN",
    label: "correct",
    score: 1.0
  }
});

Trace Notes

Notes on entire traces (multiple notes allowed per trace):

import { addTraceNote } from "@arizeai/phoenix-client/traces";

await addTraceNote({
  client,
  traceNote: {
    traceId: "abc123def456",
    note: "Needs follow-up — unexpected tool call sequence"
  }
});

Session Annotations

Feedback on multi-turn conversations:

import { addSessionAnnotation } from "@arizeai/phoenix-client/sessions";

await addSessionAnnotation({
  client,
  sessionAnnotation: {
    sessionId: "session_xyz",
    name: "user_satisfaction",
    annotatorKind: "HUMAN",
    label: "satisfied",
    score: 0.85
  }
});

RAG Pipeline Example

import { createClient } from "@arizeai/phoenix-client";
import { logDocumentAnnotations, addSpanAnnotation } from "@arizeai/phoenix-client/spans";
import { addTraceAnnotation } from "@arizeai/phoenix-client/traces";

const client = createClient();

// Document relevance (batch)
await logDocumentAnnotations({
  client,
  documentAnnotations: [
    { spanId: "retriever_span", documentPosition: 0, name: "relevance",
      annotatorKind: "LLM", label: "relevant", score: 0.95 },
    { spanId: "retriever_span", documentPosition: 1, name: "relevance",
      annotatorKind: "LLM", label: "relevant", score: 0.80 }
  ]
});

// LLM response quality
await addSpanAnnotation({
  client,
  spanAnnotation: {
    spanId: "llm_span",
    name: "faithfulness",
    annotatorKind: "LLM",
    label: "faithful",
    score: 0.90
  }
});

// Overall trace quality
await addTraceAnnotation({
  client,
  traceAnnotation: {
    traceId: "trace_123",
    name: "correctness",
    annotatorKind: "HUMAN",
    label: "correct",
    score: 1.0
  }
});

API Reference

TypeScript Client API

Flattening Convention

OpenInference flattens nested data structures into dot-notation attributes for database compatibility, OpenTelemetry compatibility, and simple querying.

Flattening Rules

Objects → Dot Notation

{ llm: { model_name: "gpt-4", token_count: { prompt: 10, completion: 20 } } }
// becomes
{ "llm.model_name": "gpt-4", "llm.token_count.prompt": 10, "llm.token_count.completion": 20 }

Arrays → Zero-Indexed Notation

{ llm: { input_messages: [{ role: "user", content: "Hi" }] } }
// becomes
{ "llm.input_messages.0.message.role": "user", "llm.input_messages.0.message.content": "Hi" }

Message Convention: `.message.` segment required

llm.input_messages.{index}.message.{field}
llm.input_messages.0.message.tool_calls.0.tool_call.function.name

Complete Example

// Original
{
  openinference: { span: { kind: "LLM" } },
  llm: {
    model_name: "claude-3-5-sonnet-20241022",
    invocation_parameters: { temperature: 0.7, max_tokens: 1000 },
    input_messages: [{ role: "user", content: "Tell me a joke" }],
    output_messages: [{ role: "assistant", content: "Why did the chicken cross the road?" }],
    token_count: { prompt: 5, completion: 10, total: 15 }
  }
}

// Flattened (stored in Phoenix spans.attributes JSONB)
{
  "openinference.span.kind": "LLM",
  "llm.model_name": "claude-3-5-sonnet-20241022",
  "llm.invocation_parameters": "{\"temperature\": 0.7, \"max_tokens\": 1000}",
  "llm.input_messages.0.message.role": "user",
  "llm.input_messages.0.message.content": "Tell me a joke",
  "llm.output_messages.0.message.role": "assistant",
  "llm.output_messages.0.message.content": "Why did the chicken cross the road?",
  "llm.token_count.prompt": 5,
  "llm.token_count.completion": 10,
  "llm.token_count.total": 15
}
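
The rules above can be sketched as a small recursive function. This is an illustration under stated assumptions, not the OpenInference implementation: `flatten`, `ITEM_SEGMENTS`, and `JSON_STRING_PATHS` are hypothetical names, and the two lookup tables only cover the cases shown in this document (the `.message.` and `.tool_call.` segments, and `llm.invocation_parameters` kept as a JSON string).

```python
import json
from typing import Any

# Assumed for this sketch: list items under these keys get an extra
# singular segment, and these paths are stored as JSON strings.
ITEM_SEGMENTS = {
    "input_messages": "message",
    "output_messages": "message",
    "tool_calls": "tool_call",
}
JSON_STRING_PATHS = {"llm.invocation_parameters"}


def flatten(value: Any, path: str = "") -> dict[str, Any]:
    """Flatten nested dicts and lists into dot-notation attributes."""
    if path in JSON_STRING_PATHS:
        return {path: json.dumps(value)}
    if isinstance(value, dict):
        flat: dict[str, Any] = {}
        for key, child in value.items():
            flat.update(flatten(child, f"{path}.{key}" if path else key))
        return flat
    if isinstance(value, list):
        flat = {}
        # Look up the extra segment by the last key in the path, if any.
        segment = ITEM_SEGMENTS.get(path.rsplit(".", 1)[-1])
        for index, child in enumerate(value):
            item_path = f"{path}.{index}"
            if segment:
                item_path = f"{item_path}.{segment}"
            flat.update(flatten(child, item_path))
        return flat
    return {path: value}


span = {
    "openinference": {"span": {"kind": "LLM"}},
    "llm": {
        "invocation_parameters": {"temperature": 0.7, "max_tokens": 1000},
        "input_messages": [{"role": "user", "content": "Tell me a joke"}],
        "token_count": {"prompt": 5, "completion": 10, "total": 15},
    },
}
flat = flatten(span)
print(flat["llm.input_messages.0.message.role"])  # user
```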

Overview and Traces & Spans

This document covers the fundamental concepts of OpenInference traces and spans in Phoenix.

Overview

OpenInference is a set of semantic conventions for AI and LLM applications based on OpenTelemetry. Phoenix uses these conventions to capture, store, and analyze traces from AI applications.

Key Concepts:

Traces represent end-to-end requests through your application
Spans represent individual operations within a trace (LLM calls, retrievals, tool invocations)
Attributes are key-value pairs attached to spans using flattened, dot-notation paths
Span Kinds categorize the type of operation (LLM, RETRIEVER, TOOL, etc.)

Traces and Spans

Trace Hierarchy

A trace is a tree of spans representing a complete request:

Trace ID: abc123
├─ Span 1: CHAIN (root span, parent_id = null)
│  ├─ Span 2: RETRIEVER (parent_id = span_1_id)
│  │  └─ Span 3: EMBEDDING (parent_id = span_2_id)
│  └─ Span 4: LLM (parent_id = span_1_id)
│     └─ Span 5: TOOL (parent_id = span_4_id)

Context Propagation

Spans maintain parent-child relationships via:

trace_id - Same for all spans in a trace
span_id - Unique identifier for this span
parent_id - References parent span's span_id (null for root spans)

Phoenix uses these relationships to:

Build the span tree visualization in the UI
Calculate cumulative metrics (tokens, errors) up the tree
Enable nested querying (e.g., "find CHAIN spans containing LLM spans with errors")
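
As a rough picture of how parent-child links support cumulative metrics, the sketch below rebuilds the example tree from flat span records and sums token counts up to the root. The record layout and the token numbers are made up for illustration; this is not Phoenix's internal code.

```python
from collections import defaultdict

# Flat span records; field names follow the text, token counts are
# made-up illustration data.
spans = [
    {"span_id": "s1", "parent_id": None, "kind": "CHAIN", "tokens": 0},
    {"span_id": "s2", "parent_id": "s1", "kind": "RETRIEVER", "tokens": 0},
    {"span_id": "s3", "parent_id": "s2", "kind": "EMBEDDING", "tokens": 8},
    {"span_id": "s4", "parent_id": "s1", "kind": "LLM", "tokens": 15},
    {"span_id": "s5", "parent_id": "s4", "kind": "TOOL", "tokens": 0},
]

# Index children by parent_id; the root span is filed under None.
children = defaultdict(list)
for span in spans:
    children[span["parent_id"]].append(span)


def cumulative_tokens(span: dict) -> int:
    # A span's cumulative count is its own tokens plus those of all descendants.
    return span["tokens"] + sum(cumulative_tokens(c) for c in children[span["span_id"]])


root = children[None][0]
print(cumulative_tokens(root))  # 23
```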

Span Lifecycle

Each span has:

start_time - When the operation began (Unix timestamp in nanoseconds)
end_time - When the operation completed
status_code - OK, ERROR, or UNSET
status_message - Optional error message
attributes - object with all semantic convention attributes
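
Because timestamps are in nanoseconds, durations need a unit conversion before they are readable. A minimal sketch, assuming `end_time` uses the same nanosecond unit as `start_time` (the 250 ms gap is invented for the example):

```python
import time

start_time = time.time_ns()          # Unix timestamp in nanoseconds
end_time = start_time + 250_000_000  # pretend the operation took 250 ms

# 1 ms = 1,000,000 ns
duration_ms = (end_time - start_time) / 1_000_000
print(duration_ms)  # 250.0
```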

Required and Recommended Attributes

This document covers the required attribute and highly recommended attributes for all OpenInference spans.

Required Attribute

Every span MUST have exactly one required attribute:

{
  "openinference.span.kind": "LLM"
}

Highly Recommended Attributes

While not strictly required, these attributes are highly recommended on all spans as they:

Enable evaluation and quality assessment
Help understand information flow through your application
Make traces more useful for debugging

Input/Output Values

Attribute	Type	Description
`input.value`	String	Input to the operation (prompt, query, document)
`output.value`	String	Output from the operation (response, result, answer)

Example:

{
  "openinference.span.kind": "LLM",
  "input.value": "What is the capital of France?",
  "output.value": "The capital of France is Paris."
}

Why these matter:

Evaluations: Many evaluators (faithfulness, relevance, hallucination detection) require both input and output to assess quality
Information flow: Seeing inputs/outputs makes it easy to trace how data transforms through your application
Debugging: When something goes wrong, having the actual input/output makes root cause analysis much faster
Analytics: Enables pattern analysis across similar inputs or outputs

Phoenix Behavior:

Input/output displayed prominently in span details
Evaluators can automatically access these values
Search/filter traces by input or output content
Export inputs/outputs for fine-tuning datasets

Valid Span Kinds

There are exactly 9 valid span kinds in OpenInference:

Span Kind	Purpose	Common Use Case
`LLM`	Language model inference	OpenAI, Anthropic, local LLM calls
`EMBEDDING`	Vector generation	Text-to-vector conversion
`CHAIN`	Application flow orchestration	LangChain chains, custom workflows
`RETRIEVER`	Document/context retrieval	Vector DB queries, semantic search
`RERANKER`	Result reordering	Rerank retrieved documents
`TOOL`	External tool invocation	API calls, function execution
`AGENT`	Autonomous reasoning	ReAct agents, planning loops
`GUARDRAIL`	Safety/policy checks	Content moderation, PII detection
`EVALUATOR`	Quality assessment	Answer relevance, faithfulness scoring
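
One way to guard against typos in the required attribute is to validate it against the nine kinds before exporting. The sketch below is a hypothetical helper (`SpanKind` and `check_span_kind` are names invented here, not part of any Phoenix package) and it assumes the attribute value is the uppercase form shown in the table.

```python
from enum import Enum


class SpanKind(str, Enum):
    LLM = "LLM"
    EMBEDDING = "EMBEDDING"
    CHAIN = "CHAIN"
    RETRIEVER = "RETRIEVER"
    RERANKER = "RERANKER"
    TOOL = "TOOL"
    AGENT = "AGENT"
    GUARDRAIL = "GUARDRAIL"
    EVALUATOR = "EVALUATOR"


def check_span_kind(attributes: dict) -> SpanKind:
    """Return the span kind, or raise if it is missing or not one of the nine."""
    try:
        return SpanKind(attributes["openinference.span.kind"])
    except KeyError:
        raise ValueError("missing openinference.span.kind") from None
    except ValueError:
        kind = attributes["openinference.span.kind"]
        raise ValueError(f"invalid span kind: {kind!r}") from None


print(check_span_kind({"openinference.span.kind": "RETRIEVER"}).value)  # RETRIEVER
```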

Universal Attributes

This document covers attributes that can be used on any span kind in OpenInference.

Overview

These attributes can be used on any span kind to provide additional context, tracking, and metadata.

Input/Output

Attribute	Type	Description
`input.value`	String	Input to the operation (prompt, query, document)
`input.mime_type`	String	MIME type (e.g., "text/plain", "application/json")
`output.value`	String	Output from the operation (response, vector, result)
`output.mime_type`	String	MIME type of output

Why Capture I/O?

Always capture input/output for evaluation-ready spans:

Phoenix evaluators (faithfulness, relevance, Q&A correctness) require input.value and output.value
Phoenix UI displays I/O prominently in trace views for debugging
Enables exporting I/O for creating fine-tuning datasets
Provides complete context for analyzing agent behavior

Example attributes:

{
  "openinference.span.kind": "CHAIN",
  "input.value": "What is the weather?",
  "input.mime_type": "text/plain",
  "output.value": "I don't have access to weather data.",
  "output.mime_type": "text/plain"
}

See language-specific implementation:

TypeScript: instrumentation-manual-typescript.md
Python: instrumentation-manual-python.md
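
When building attributes by hand, the value and its MIME type have to stay in sync. The following sketch shows one possible convention; `io_attributes` is a hypothetical helper, and treating every non-string value as JSON is an assumption of this example rather than a documented rule.

```python
import json
from typing import Any


def io_attributes(prefix: str, value: Any) -> dict[str, str]:
    """Hypothetical helper: build <prefix>.value and <prefix>.mime_type."""
    if isinstance(value, str):
        return {f"{prefix}.value": value, f"{prefix}.mime_type": "text/plain"}
    # Structured values are serialized so the attribute stays a string.
    return {
        f"{prefix}.value": json.dumps(value),
        f"{prefix}.mime_type": "application/json",
    }


attributes = {"openinference.span.kind": "CHAIN"}
attributes.update(io_attributes("input", "What is the weather?"))
attributes.update(io_attributes("output", {"answer": None, "reason": "no weather data"}))

print(attributes["input.mime_type"])   # text/plain
print(attributes["output.mime_type"])  # application/json
```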

Session and User Tracking

Attribute	Type	Description
`session.id`	String	Session identifier for grouping related traces
`user.id`	String	User identifier for per-user analysis

Example:

{
  "openinference.span.kind": "LLM",
  "session.id": "session_abc123",
  "user.id": "user_xyz789"
}

Metadata

Attribute	Type	Description
`metadata`	string	JSON-serialized object of key-value pairs

Example:

{
  "openinference.span.kind": "LLM",
  "metadata": "{\"environment\": \"production\", \"model_version\": \"v2.1\", \"cost_center\": \"engineering\"}"
}
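
Since `metadata` is a JSON string rather than a flattened object, it has to be serialized on write and parsed on read. A minimal sketch using only the standard library (the metadata keys are example values):

```python
import json

metadata = {"environment": "production", "model_version": "v2.1"}

attributes = {
    "openinference.span.kind": "LLM",
    "session.id": "session_abc123",
    "user.id": "user_xyz789",
    # Serialized to a single string rather than flattened into dot-notation keys.
    "metadata": json.dumps(metadata),
}

# Reading it back requires parsing the string again.
restored = json.loads(attributes["metadata"])
print(restored["environment"])  # production
```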

Phoenix Tracing: Auto-Instrumentation (Python)

Automatically create spans for LLM calls without code changes.

Overview

Auto-instrumentation patches supported libraries at runtime to create spans automatically. Use it for supported frameworks (LangChain, LlamaIndex, OpenAI SDK, etc.). For custom logic, see instrumentation-manual-python.md.

Supported Frameworks

Python:

LLM SDKs: OpenAI, Anthropic, Bedrock, Mistral, Vertex AI, Groq, Ollama
Frameworks: LangChain, LlamaIndex, DSPy, CrewAI, Instructor, Haystack
Install: pip install openinference-instrumentation-{name}

Setup

Install and enable:

pip install arize-phoenix-otel
pip install openinference-instrumentation-openai  # Add others as needed

from phoenix.otel import register

register(project_name="my-app", auto_instrument=True)  # Discovers all installed instrumentors

Example:

from phoenix.otel import register
from openai import OpenAI

register(project_name="my-app", auto_instrument=True)

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}]
)

Traces appear in the Phoenix UI with model, input/output, token counts, and timing captured automatically. See the span kind files for full attribute schemas.

Selective instrumentation (explicit control):

from phoenix.otel import register
from openinference.instrumentation.openai import OpenAIInstrumentor

tracer_provider = register(project_name="my-app")  # No auto_instrument
OpenAIInstrumentor().instrument(tracer_provider=tracer_provider)

Limitations

Auto-instrumentation does NOT capture:

Custom business logic
Internal function calls

Example:

def my_custom_workflow(query: str) -> str:
    preprocessed = preprocess(query)  # Not traced
    response = client.chat.completions.create(...)  # Traced (auto)
    postprocessed = postprocess(response)  # Not traced
    return postprocessed

Solution: Add manual instrumentation:

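from phoenix.otel import register

# Get a tracer from the registered provider (same setup as manual instrumentation)
tracer_provider = register(project_name="my-app", auto_instrument=True)
tracer = tracer_provider.get_tracer(__name__)

@tracer.chain
def my_custom_workflow(query: str) -> str:
    preprocessed = preprocess(query)  # Now inside the CHAIN span
    response = client.chat.completions.create(...)  # Child span (auto)
    postprocessed = postprocess(response)
    return postprocessed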

Auto-Instrumentation (TypeScript)

Automatically create spans for LLM calls without code changes.

Supported Frameworks

LLM SDKs: OpenAI
Frameworks: LangChain
Install: npm install @arizeai/openinference-instrumentation-{name}

Setup

CommonJS (automatic):

const { register } = require("@arizeai/phoenix-otel");
const OpenAI = require("openai");

register({ projectName: "my-app" });

const client = new OpenAI();

ESM (manual required):

import { register, registerInstrumentations } from "@arizeai/phoenix-otel";
import { OpenAIInstrumentation } from "@arizeai/openinference-instrumentation-openai";
import OpenAI from "openai";

register({ projectName: "my-app" });

const instrumentation = new OpenAIInstrumentation();
instrumentation.manuallyInstrument(OpenAI);
registerInstrumentations({ instrumentations: [instrumentation] });

Why: ESM imports are hoisted before register() runs.

Limitations

What auto-instrumentation does NOT capture:

async function myWorkflow(query: string): Promise<string> {
  const preprocessed = await preprocess(query);        // Not traced
  const response = await client.chat.completions.create(...);  // Traced (auto)
  const postprocessed = await postprocess(response);   // Not traced
  return postprocessed;
}

Solution: Add manual instrumentation for custom logic:

import { traceChain } from "@arizeai/openinference-core";

const myWorkflow = traceChain(
  async (query: string): Promise<string> => {
    const preprocessed = await preprocess(query);
    const response = await client.chat.completions.create(...);
    const postprocessed = await postprocess(response);
    return postprocessed;
  },
  { name: "my-workflow" }
);

Combining Auto + Manual

import { register } from "@arizeai/phoenix-otel";
import { traceChain } from "@arizeai/openinference-core";

register({ projectName: "my-app" });

const client = new OpenAI();

const workflow = traceChain(
  async (query: string) => {
    const preprocessed = await preprocess(query);
    const response = await client.chat.completions.create(...);  // Auto-instrumented
    return postprocess(response);
  },
  { name: "my-workflow" }
);

Manual Instrumentation (Python)

Add custom spans using decorators or context managers for fine-grained tracing control.

Setup

pip install arize-phoenix-otel

from phoenix.otel import register
tracer_provider = register(project_name="my-app")
tracer = tracer_provider.get_tracer(__name__)

Quick Reference

Span Kind	Decorator	Use Case
CHAIN	`@tracer.chain`	Orchestration, workflows, pipelines
RETRIEVER	`@tracer.retriever`	Vector search, document retrieval
TOOL	`@tracer.tool`	External API calls, function execution
AGENT	`@tracer.agent`	Multi-step reasoning, planning
LLM	`@tracer.llm`	LLM API calls (manual only)
EMBEDDING	`@tracer.embedding`	Embedding generation
RERANKER	`@tracer.reranker`	Document re-ranking
GUARDRAIL	`@tracer.guardrail`	Safety checks, content moderation
EVALUATOR	`@tracer.evaluator`	LLM evaluation, quality checks

Decorator Approach (Recommended)

Use for: Full function instrumentation, automatic I/O capture

@tracer.chain
def rag_pipeline(query: str) -> str:
    docs = retrieve_documents(query)
    ranked = rerank(docs, query)
    return generate_response(ranked, query)

@tracer.retriever
def retrieve_documents(query: str) -> list[dict]:
    results = vector_db.search(query, top_k=5)
    return [{"content": doc.text, "score": doc.score} for doc in results]

@tracer.tool
def get_weather(city: str) -> str:
    response = requests.get(f"https://api.weather.com/{city}")
    return response.json()["weather"]

Custom span names:

@tracer.chain(name="rag-pipeline-v2")
def my_workflow(query: str) -> str:
    return process(query)
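
To show what "automatic I/O capture" means in principle, and why both the bare and the `name=` forms can work, here is a toy decorator built only on the standard library. It is a conceptual stand-in and not how the Phoenix tracer is implemented: `chain`, `recorded`, and the two example functions are invented for this sketch, and nothing is exported to Phoenix.

```python
import functools
import json

recorded = []  # stand-in for an exporter; real spans would go to Phoenix


def chain(func=None, *, name=None):
    """Toy decorator usable as @chain or @chain(name="...")."""

    def decorate(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            result = fn(*args, **kwargs)
            # Record arguments and return value once the call finishes.
            recorded.append({
                "name": name or fn.__name__,
                "openinference.span.kind": "CHAIN",
                "input.value": json.dumps({"args": args, "kwargs": kwargs}),
                "output.value": str(result),
            })
            return result

        return wrapper

    # Bare @chain passes the function directly; @chain(...) returns the decorator.
    return decorate(func) if func is not None else decorate


@chain
def shout(text: str) -> str:
    return text.upper()


@chain(name="shout-v2")
def shout_twice(text: str) -> str:
    return shout(text) * 2


shout_twice("hi")
print([span["name"] for span in recorded])  # ['shout', 'shout-v2']
```

The inner call finishes first, so its record is appended before the outer one.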

Context Manager Approach

Use for: Partial function instrumentation, custom attributes, dynamic control

from opentelemetry.trace import Status, StatusCode
import json

def retrieve_with_metadata(query: str):
    with tracer.start_as_current_span(
        "vector_search",
        openinference_span_kind="retriever"
    ) as span:
        span.set_attribute("input.value", query)

        results = vector_db.search(query, top_k=5)

        documents = [
            {
                "document.id": doc.id,
                "document.content": doc.text,
                "document.score": doc.score
            }
            for doc in results
        ]
        span.set_attribute("retrieval.documents", json.dumps(documents))
        span.set_status(Status(StatusCode.OK))

        return documents
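
The example above stores the documents as one JSON string. An alternative that follows the zero-indexed flattening convention is to emit one attribute per document field. The sketch below assumes the `retrieval.documents.{i}.document.{field}` key layout; `document_attributes` is a hypothetical helper and the documents are sample data.

```python
def document_attributes(documents: list[dict]) -> dict:
    """Flatten documents into retrieval.documents.{i}.document.{field} keys."""
    flat = {}
    for index, doc in enumerate(documents):
        for field_name, value in doc.items():
            flat[f"retrieval.documents.{index}.document.{field_name}"] = value
    return flat


docs = [
    {"id": "doc-1", "content": "Paris is the capital of France.", "score": 0.92},
    {"id": "doc-2", "content": "France is in Europe.", "score": 0.71},
]
attrs = document_attributes(docs)
print(attrs["retrieval.documents.1.document.score"])  # 0.71
```

Inside a span, each key-value pair could then be passed to `span.set_attribute(key, value)`.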

Capturing Input/Output

Always capture I/O for evaluation-ready spans.

Automatic I/O Capture (Decorators)

Decorators automatically capture input arguments and return values:

@tracer.chain
def handle_query(user_input: str) -> str:
    result = agent.generate(user_input)
    return result.text

# Automatically captures:
# - input.value: user_input
# - output.value: result.text
# - input.mime_type / output.mime_type: auto-detected

Manual I/O Capture (Context Manager)

Use set_input() and set_output() for simple I/O capture:

from opentelemetry.trace import Status, StatusCode

def handle_query(user_input: str) -> str:
    with tracer.start_as_current_span(
        "query.handler",
        openinference_span_kind="chain"
    ) as span:
        span.set_input(user_input)

        result = agent.generate(user_input)

        span.set_output(result.text)
        span.set_status(Status(StatusCode.OK))

        return result.text

What gets captured:

{
  "input.value": "What is 2+2?",
  "input.mime_type": "text/plain",
  "output.value": "2+2 equals 4.",
  "output.mime_type": "text/plain"
}

Why this matters:

Phoenix evaluators require input.value and output.value
Phoenix UI displays I/O prominently for debugging
Enables exporting data for fine-tuning datasets

Custom I/O with Additional Metadata

Use set_attribute() for custom attributes alongside I/O:

def process_query(query: str):
    with tracer.start_as_current_span(
        "query.process",
        openinference_span_kind="chain"
    ) as span:
        # Standard I/O
        span.set_input(query)

        # Custom metadata
        span.set_attribute("input.length", len(query))

        result = llm.generate(query)

        # Standard output
        span.set_output(result.text)

        # Custom metadata
        span.set_attribute("output.tokens", result.usage.total_tokens)
        span.set_status(Status(StatusCode.OK))

        return result


See Also

Span attributes: span-chain.md, span-retriever.md, span-tool.md, span-llm.md, span-agent.md, span-embedding.md, span-reranker.md, span-guardrail.md, span-evaluator.md
Auto-instrumentation: instrumentation-auto-python.md for framework integrations
API docs: https://docs.arize.com/phoenix/tracing/manual-instrumentation

Manual Instrumentation (TypeScript)

Add custom spans using convenience wrappers or withSpan for fine-grained tracing control.

Setup

npm install @arizeai/phoenix-otel @arizeai/openinference-core

import { register } from "@arizeai/phoenix-otel";
register({ projectName: "my-app" });

Quick Reference

Span Kind	Method	Use Case
CHAIN	`traceChain`	Workflows, pipelines, orchestration
AGENT	`traceAgent`	Multi-step reasoning, planning
TOOL	`traceTool`	External APIs, function calls
RETRIEVER	`withSpan`	Vector search, document retrieval
LLM	`withSpan`	LLM API calls (prefer auto-instrumentation)
EMBEDDING	`withSpan`	Embedding generation
RERANKER	`withSpan`	Document re-ranking
GUARDRAIL	`withSpan`	Safety checks, content moderation
EVALUATOR	`withSpan`	LLM evaluation

Convenience Wrappers

import { traceChain, traceAgent, traceTool } from "@arizeai/openinference-core";

// CHAIN - workflows
const pipeline = traceChain(
  async (query: string) => {
    const docs = await retrieve(query);
    return await generate(docs, query);
  },
  { name: "rag-pipeline" }
);

// AGENT - reasoning
const agent = traceAgent(
  async (question: string) => {
    const thought = await llm.generate(`Think: ${question}`);
    return await processThought(thought);
  },
  { name: "my-agent" }
);

// TOOL - function calls
const getWeather = traceTool(
  async (city: string) => fetch(`/api/weather/${city}`).then(r => r.json()),
  { name: "get-weather" }
);

withSpan for Other Kinds

import { withSpan, getInputAttributes, getRetrieverAttributes } from "@arizeai/openinference-core";

// RETRIEVER with custom attributes
const retrieve = withSpan(
  async (query: string) => {
    const results = await vectorDb.search(query, { topK: 5 });
    return results.map(doc => ({ content: doc.text, score: doc.score }));
  },
  {
    kind: "RETRIEVER",
    name: "vector-search",
    processInput: (query) => getInputAttributes(query),
    processOutput: (docs) => getRetrieverAttributes({ documents: docs })
  }
);

Options:

withSpan(fn, {
  kind: "RETRIEVER",              // OpenInference span kind
  name: "span-name",              // Span name (defaults to function name)
  processInput: (args) => {},     // Transform input to attributes
  processOutput: (result) => {},  // Transform output to attributes
  attributes: { key: "value" }    // Static attributes
});

Capturing Input/Output

Always capture I/O for evaluation-ready spans. Use getInputAttributes and getOutputAttributes helpers for automatic MIME type detection:

import {
  getInputAttributes,
  getOutputAttributes,
  withSpan,
} from "@arizeai/openinference-core";

const handleQuery = withSpan(
  async (userInput: string) => {
    const result = await agent.generate({ prompt: userInput });
    return result;
  },
  {
    name: "query.handler",
    kind: "CHAIN",
    // Use helpers - automatic MIME type detection
    processInput: (input) => getInputAttributes(input),
    processOutput: (result) => getOutputAttributes(result.text),
  }
);

await handleQuery("What is 2+2?");

What gets captured:

{
  "input.value": "What is 2+2?",
  "input.mime_type": "text/plain",
  "output.value": "2+2 equals 4.",
  "output.mime_type": "text/plain"
}

Helper behavior:

Strings → text/plain
Objects/Arrays → application/json (automatically serialized)
undefined/null → No attributes set

Why this matters:

Phoenix evaluators require input.value and output.value
Phoenix UI displays I/O prominently for debugging
Enables exporting data for fine-tuning datasets

Custom I/O Processing

Add custom metadata alongside standard I/O attributes:

const processWithMetadata = withSpan(
  async (query: string) => {
    const result = await llm.generate(query);
    return result;
  },
  {
    name: "query.process",
    kind: "CHAIN",
    processInput: (query) => ({
      "input.value": query,
      "input.mime_type": "text/plain",
      "input.length": query.length,  // Custom attribute
    }),
    processOutput: (result) => ({
      "output.value": result.text,
      "output.mime_type": "text/plain",
      "output.tokens": result.usage?.totalTokens,  // Custom attribute
    }),
  }
);

Phoenix Tracing: Custom Metadata (Python)

Add custom attributes to spans for richer observability.

Install

pip install arize-phoenix-otel  # context managers and SpanAttributes re-exported since 0.16.0

Session

from phoenix.otel import using_session

with using_session(session_id="my-session-id"):
    # Spans get: "session.id" = "my-session-id"
    ...

User

from phoenix.otel import using_user

with using_user("my-user-id"):
    # Spans get: "user.id" = "my-user-id"
    ...

Metadata

from phoenix.otel import using_metadata

with using_metadata({"key": "value", "experiment_id": "exp_123"}):
    # Spans get: "metadata" = '{"key": "value", "experiment_id": "exp_123"}'
    ...

Combined (using_attributes)

from phoenix.otel import using_attributes

with using_attributes(
    session_id="my-session-id",
    user_id="my-user-id",
    metadata={"environment": "production"},
    tags=["prod", "v2"],
    prompt_template="Answer: {question}",
    prompt_template_version="v1.0",
    prompt_template_variables={"question": "What is Phoenix?"},
):
    # All attributes applied to spans in this context
    ...

On a Single Span

span.set_attribute("metadata", json.dumps({"key": "value"}))
span.set_attribute("user.id", "user_123")
span.set_attribute("session.id", "session_456")

As Decorators

All context managers can be used as decorators:

from phoenix.otel import using_session, using_user, using_metadata

@using_session(session_id="my-session-id")
@using_user("my-user-id")
@using_metadata({"env": "prod"})
def my_function():
    ...

Phoenix Tracing: Custom Metadata (TypeScript)

Add custom attributes to spans for richer observability.

Using Context (Propagates to All Child Spans)

import { context } from "@arizeai/phoenix-otel";
import { setMetadata } from "@arizeai/openinference-core";

context.with(
  setMetadata(context.active(), {
    experiment_id: "exp_123",
    model_version: "gpt-4-1106-preview",
    environment: "production",
  }),
  async () => {
    // All spans created within this block will have:
    // "metadata" = '{"experiment_id": "exp_123", ...}'
    await myApp.run(query);
  }
);

On a Single Span

import { traceChain } from "@arizeai/openinference-core";
import { trace } from "@arizeai/phoenix-otel";

const myFunction = traceChain(
  async (input: string) => {
    const span = trace.getActiveSpan();

    span?.setAttribute(
      "metadata",
      JSON.stringify({
        experiment_id: "exp_123",
        model_version: "gpt-4-1106-preview",
        environment: "production",
      })
    );

    // doWork is a placeholder for your application logic
    const result = await doWork(input);
    return result;
  },
  { name: "my-function" }
);

await myFunction("hello");

Phoenix Tracing: Production Guide (Python)

CRITICAL: Configure batching, data masking, and span filtering for production deployment.

Metadata

Attribute	Value
Priority	Critical - production readiness
Impact	Security, Performance
Setup Time	5-15 min

Batch Processing

Enable batch processing for production efficiency. Batching reduces network overhead by sending spans in groups rather than individually.

Data Masking (PII Protection)

Environment variables:

export OPENINFERENCE_HIDE_INPUTS=true          # Hide input.value
export OPENINFERENCE_HIDE_OUTPUTS=true         # Hide output.value
export OPENINFERENCE_HIDE_INPUT_MESSAGES=true  # Hide LLM input messages
export OPENINFERENCE_HIDE_OUTPUT_MESSAGES=true # Hide LLM output messages
export OPENINFERENCE_HIDE_INPUT_IMAGES=true    # Hide image content
export OPENINFERENCE_HIDE_INPUT_TEXT=true      # Hide embedding text
export OPENINFERENCE_BASE64_IMAGE_MAX_LENGTH=10000  # Limit image size

Python TraceConfig:

from phoenix.otel import register
from openinference.instrumentation import TraceConfig

config = TraceConfig(
    hide_inputs=True,
    hide_outputs=True,
    hide_input_messages=True
)
register(trace_config=config)

Precedence: Code > Environment variables > Defaults
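The precedence rule above can be sketched in a few lines of Python. This is only an illustration of the ordering (code, then environment, then default): the function name `resolve_flag`, the `env` parameter, and the way the string "true" is parsed are assumptions for this example, not the behavior of `TraceConfig` itself.

```python
import os
from typing import Mapping, Optional


def resolve_flag(
    code_value: Optional[bool],
    env_name: str,
    default: bool = False,
    env: Optional[Mapping[str, str]] = None,
) -> bool:
    """Resolve a masking flag: explicit code value, then environment, then default."""
    if code_value is not None:
        return code_value  # 1. a value set in code always wins
    env = os.environ if env is None else env
    raw = env.get(env_name)
    if raw is not None:
        return raw.strip().lower() == "true"  # 2. environment variable (assumed parsing)
    return default  # 3. built-in default


# A fake environment keeps the example independent of the real process environment
fake_env = {"OPENINFERENCE_HIDE_INPUTS": "true"}
print(resolve_flag(None, "OPENINFERENCE_HIDE_INPUTS", env=fake_env))   # environment beats default
print(resolve_flag(False, "OPENINFERENCE_HIDE_INPUTS", env=fake_env))  # code beats environment
print(resolve_flag(None, "OPENINFERENCE_HIDE_OUTPUTS", env=fake_env))  # falls back to default
```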

---

Span Filtering

Suppress specific code blocks:

from phoenix.otel import suppress_tracing

with suppress_tracing():
    internal_logging()  # No spans generated

Phoenix Tracing: Production Guide (TypeScript)

CRITICAL: Configure batching, data masking, and span filtering for production deployment.

Metadata

Attribute	Value
Priority	Critical - production readiness
Impact	Security, Performance
Setup Time	5-15 min

Batch Processing

Enable batch processing for production efficiency. Batching reduces network overhead by sending spans in groups rather than individually.

import { register } from "@arizeai/phoenix-otel";

const provider = register({
  projectName: "my-app",
  batch: true,  // Production default
});

Shutdown Handling

CRITICAL: Spans may not be exported if still queued in the processor when your process exits. Call provider.shutdown() to explicitly flush before exit.

// Explicit shutdown to flush queued spans
const provider = register({
  projectName: "my-app",
  batch: true,
});

async function main() {
  await doWork();
  await provider.shutdown();  // Flush spans before exit
}

main().catch(async (error) => {
  console.error(error);
  await provider.shutdown();  // Flush on error too
  process.exit(1);
});

Graceful termination signals:

// Graceful shutdown on SIGTERM
const provider = register({
  projectName: "my-server",
  batch: true,
});

process.on("SIGTERM", async () => {
  await provider.shutdown();
  process.exit(0);
});

---

Data Masking (PII Protection)

Environment variables:

export OPENINFERENCE_HIDE_INPUTS=true          # Hide input.value
export OPENINFERENCE_HIDE_OUTPUTS=true         # Hide output.value
export OPENINFERENCE_HIDE_INPUT_MESSAGES=true  # Hide LLM input messages
export OPENINFERENCE_HIDE_OUTPUT_MESSAGES=true # Hide LLM output messages
export OPENINFERENCE_HIDE_INPUT_IMAGES=true    # Hide image content
export OPENINFERENCE_HIDE_INPUT_TEXT=true      # Hide embedding text
export OPENINFERENCE_BASE64_IMAGE_MAX_LENGTH=10000  # Limit image size

TypeScript TraceConfig:

import { register } from "@arizeai/phoenix-otel";
import { OpenAIInstrumentation } from "@arizeai/openinference-instrumentation-openai";

const traceConfig = {
  hideInputs: true,
  hideOutputs: true,
  hideInputMessages: true
};

const instrumentation = new OpenAIInstrumentation({ traceConfig });

Precedence: Code > Environment variables > Defaults

---

Span Filtering

Suppress specific code blocks:

import { suppressTracing } from "@opentelemetry/core";
import { context } from "@opentelemetry/api";

await context.with(suppressTracing(context.active()), async () => {
  internalLogging(); // No spans generated
});

Sampling:

export OTEL_TRACES_SAMPLER="parentbased_traceidratio"
export OTEL_TRACES_SAMPLER_ARG="0.1"  # Sample 10%
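To make the sampler setting more concrete, here is a simplified model of a parent-based, trace-ID-ratio decision, written in Python as a language-neutral illustration. The function name `should_sample` and the choice to compare only the low 64 bits of the trace ID are assumptions of this sketch; the sampler itself is provided by the OpenTelemetry SDK and is configured through the environment variables above.

```python
import random
from typing import Optional

TRACE_ID_BITS = 64  # this sketch compares only the low 64 bits of the trace ID


def should_sample(trace_id: int, ratio: float, parent_sampled: Optional[bool] = None) -> bool:
    """Simplified parent-based, trace-ID-ratio sampling decision."""
    if parent_sampled is not None:
        return parent_sampled  # child spans follow the parent's decision
    bound = round(ratio * (1 << TRACE_ID_BITS))
    return (trace_id & ((1 << TRACE_ID_BITS) - 1)) < bound


# Simulate 10,000 root traces with random 128-bit trace IDs
rng = random.Random(0)
trace_ids = [rng.getrandbits(128) for _ in range(10_000)]
kept = sum(should_sample(t, 0.1) for t in trace_ids)
print(f"kept {kept} of {len(trace_ids)} root traces")
```

Because the decision depends only on the trace ID (or on the parent), every span in a trace is kept or dropped together, so sampled traces stay complete.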

---

Error Handling

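import { SpanStatusCode, trace } from "@opentelemetry/api";

// Span created by the surrounding wrapper (undefined if none is active)
const span = trace.getActiveSpan();
let result;

try {
  result = await riskyOperation();
  span?.setStatus({ code: SpanStatusCode.OK });
} catch (e) {
  span?.recordException(e as Error);
  span?.setStatus({ code: SpanStatusCode.ERROR });
  throw e;
}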

---

Production Checklist

[ ] Batch processing enabled
[ ] Shutdown handling: Call provider.shutdown() before exit to flush queued spans
[ ] Graceful termination: Flush spans on SIGTERM/SIGINT signals
[ ] Data masking configured (HIDE_INPUTS/HIDE_OUTPUTS if PII)
[ ] Span filtering for health checks/noisy paths
[ ] Error handling implemented
[ ] Graceful degradation if Phoenix unavailable
[ ] Performance tested
[ ] Monitoring configured (Phoenix UI checked)

Phoenix Tracing: Projects (Python)

Organize traces by application using projects (Phoenix's top-level grouping).

Overview

Projects group traces for a single application or experiment.

Use for: Environments (dev/staging/prod), A/B testing, versioning

Setup

Environment Variable (Recommended)

export PHOENIX_PROJECT_NAME="my-app-prod"

import os
os.environ["PHOENIX_PROJECT_NAME"] = "my-app-prod"
from phoenix.otel import register
register()  # Uses "my-app-prod"

Code

from phoenix.otel import register
register(project_name="my-app-prod")

Use Cases

Environments:

# Dev, staging, prod
register(project_name="my-app-dev")
register(project_name="my-app-staging")
register(project_name="my-app-prod")

A/B Testing:

# Compare models
register(project_name="chatbot-gpt4")
register(project_name="chatbot-claude")

Versioning:

# Track versions
register(project_name="my-app-v1")
register(project_name="my-app-v2")

Switching Projects (Python Notebooks Only)

from openinference.instrumentation import dangerously_using_project
from phoenix.otel import register

register(project_name="my-app")

# Switch temporarily for evals
with dangerously_using_project("my-eval-project"):
    run_evaluations()

⚠️ Only use in notebooks/scripts, not production.

Phoenix Tracing: Projects (TypeScript)

Organize traces by application using projects (Phoenix's top-level grouping).

Overview

Projects group traces for a single application or experiment.

Use for: Environments (dev/staging/prod), A/B testing, versioning

Setup

Environment Variable (Recommended)

export PHOENIX_PROJECT_NAME="my-app-prod"

process.env.PHOENIX_PROJECT_NAME = "my-app-prod";
import { register } from "@arizeai/phoenix-otel";
register();  // Uses "my-app-prod"

Code

import { register } from "@arizeai/phoenix-otel";
register({ projectName: "my-app-prod" });

Use Cases

Environments:

// Dev, staging, prod
register({ projectName: "my-app-dev" });
register({ projectName: "my-app-staging" });
register({ projectName: "my-app-prod" });

A/B Testing:

// Compare models
register({ projectName: "chatbot-gpt4" });
register({ projectName: "chatbot-claude" });

Versioning:

// Track versions
register({ projectName: "my-app-v1" });
register({ projectName: "my-app-v2" });

Sessions (Python)

Track multi-turn conversations by grouping traces with session IDs.

Setup

from phoenix.otel import using_session

with using_session(session_id="user_123_conv_456"):
    response = llm.invoke(prompt)

Best Practices

Bad: Only parent span gets session ID

from phoenix.otel import SpanAttributes
from opentelemetry import trace

span = trace.get_current_span()
span.set_attribute(SpanAttributes.SESSION_ID, session_id)
response = client.chat.completions.create(...)

Good: All child spans inherit session ID

with using_session(session_id):
    response = client.chat.completions.create(...)
    result = my_custom_function()

Why: using_session() propagates session ID to all nested spans automatically.
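The difference between the two approaches can be illustrated with a toy model built on Python's `contextvars`. Everything here (`toy_using_session`, `toy_start_span`, `recorded_spans`) is a hypothetical stand-in written for this example and is not how `phoenix.otel` is implemented; it only shows why a context-scoped value reaches nested spans while an attribute set on one span does not.

```python
import contextvars
from contextlib import contextmanager
from typing import Any, Dict, Iterator, List

# Toy stand-ins for illustration only, NOT the phoenix.otel implementation
_session_id = contextvars.ContextVar("session_id", default=None)
recorded_spans: List[Dict[str, Any]] = []


@contextmanager
def toy_using_session(session_id: str) -> Iterator[None]:
    """Make a session ID ambient for everything executed inside the block."""
    token = _session_id.set(session_id)
    try:
        yield
    finally:
        _session_id.reset(token)  # restore the previous value on exit


def toy_start_span(name: str) -> None:
    # Every span reads the ambient session ID at creation time
    recorded_spans.append({"name": name, "session.id": _session_id.get()})


def my_custom_function() -> None:
    toy_start_span("custom.step")  # nested call, no session ID passed explicitly


with toy_using_session("user_123_conv_456"):
    toy_start_span("llm.call")
    my_custom_function()
toy_start_span("outside")  # created after the block, so no session ID

for span in recorded_spans:
    print(span)
```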

Session ID Patterns

import uuid

session_id = str(uuid.uuid4())
session_id = f"user_{user_id}_conv_{conversation_id}"
session_id = f"debug_{timestamp}"

Good: str(uuid.uuid4()), "user_123_conv_456"
Bad: "session_1", "test", empty string

Multi-Turn Chatbot Example

import uuid
from phoenix.otel import using_session

session_id = str(uuid.uuid4())
messages = []

def send_message(user_input: str) -> str:
    messages.append({"role": "user", "content": user_input})

    with using_session(session_id):
        response = client.chat.completions.create(
            model="gpt-4",
            messages=messages
        )

    assistant_message = response.choices[0].message.content
    messages.append({"role": "assistant", "content": assistant_message})
    return assistant_message

Additional Attributes

from phoenix.otel import using_attributes

with using_attributes(
    user_id="user_123",
    session_id="conv_456",
    metadata={"tier": "premium", "region": "us-west"}
):
    response = llm.invoke(prompt)

LangChain Integration

LangChain threads are automatically recognized as sessions:

from langchain_core.messages import HumanMessage
from langchain_openai import ChatOpenAI

llm = ChatOpenAI()

response = llm.invoke(
    [HumanMessage(content="Hi!")],
    config={"metadata": {"thread_id": "user_123_thread"}}
)

Phoenix recognizes: thread_id, session_id, conversation_id

Sessions (TypeScript)

Track multi-turn conversations by grouping traces with session IDs. Use `withSpan` directly from `@arizeai/openinference-core` - no wrappers or custom utilities needed.

Core Concept

Session Pattern:

1. Generate a unique session.id once at application startup
2. Export SESSION_ID, import withSpan where needed
3. Use withSpan to create a parent CHAIN span with session.id for each interaction
4. All child spans (LLM, TOOL, AGENT, etc.) automatically group under the parent
5. Query traces by session.id in Phoenix to see all interactions

Implementation (Best Practice)

1. Setup (instrumentation.ts)

import { register } from "@arizeai/phoenix-otel";
import { randomUUID } from "node:crypto";

// Initialize Phoenix
register({
  projectName: "your-app",
  url: process.env.PHOENIX_COLLECTOR_ENDPOINT || "http://localhost:6006",
  apiKey: process.env.PHOENIX_API_KEY,
  batch: true,
});

// Generate and export session ID
export const SESSION_ID = randomUUID();

2. Usage (app code)

import { withSpan } from "@arizeai/openinference-core";
import { SESSION_ID } from "./instrumentation";

// Use withSpan directly - no wrapper needed
const handleInteraction = withSpan(
  async () => {
    const result = await agent.generate({ prompt: userInput });
    return result;
  },
  {
    name: "cli.interaction",
    kind: "CHAIN",
    attributes: { "session.id": SESSION_ID },
  }
);

// Call it
const result = await handleInteraction();

With Input Parameters

const processQuery = withSpan(
  async (query: string) => {
    return await agent.generate({ prompt: query });
  },
  {
    name: "process.query",
    kind: "CHAIN",
    attributes: { "session.id": SESSION_ID },
  }
);

await processQuery("What is 2+2?");

Key Points

Session ID Scope

CLI/Desktop Apps: Generate once at process startup
Web Servers: Generate per-user session (e.g., on login, store in session storage)
Stateless APIs: Accept session.id as a parameter from client

Span Hierarchy

cli.interaction (CHAIN) ← session.id here
├── ai.generateText (AGENT)
│   ├── ai.generateText.doGenerate (LLM)
│   └── ai.toolCall (TOOL)
└── ai.generateText.doGenerate (LLM)

The session.id is only set on the root span. Child spans are automatically grouped by the trace hierarchy.

Querying Sessions

# Get all traces for a session
npx @arizeai/phoenix-cli traces \
  --endpoint http://localhost:6006 \
  --project your-app \
  --format raw \
  --no-progress | \
  jq '.[] | select(.spans[0].attributes["session.id"] == "YOUR-SESSION-ID")'
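The same filter can be applied in Python once the CLI output has been loaded as JSON. This is a minimal sketch that assumes the shape implied by the jq expression above (a list of traces, each with a `spans` list whose first entry carries the root attributes); the `traceId` and `name` fields in the sample data are hypothetical.

```python
from typing import Any, Dict, List


def traces_for_session(traces: List[Dict[str, Any]], session_id: str) -> List[Dict[str, Any]]:
    """Keep traces whose first span carries the given session.id (mirrors the jq filter)."""
    selected = []
    for trace in traces:
        spans = trace.get("spans") or []
        if spans and spans[0].get("attributes", {}).get("session.id") == session_id:
            selected.append(trace)
    return selected


# Hypothetical sample data shaped the way the jq expression expects
sample_traces = [
    {"traceId": "t1", "spans": [{"name": "cli.interaction", "attributes": {"session.id": "abc"}}]},
    {"traceId": "t2", "spans": [{"name": "cli.interaction", "attributes": {"session.id": "xyz"}}]},
    {"traceId": "t3", "spans": []},
]

matches = traces_for_session(sample_traces, "abc")
print([t["traceId"] for t in matches])  # ['t1']
```

Like the jq version, this only inspects the first span of each trace, so it relies on the root span being listed first.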

Dependencies

{
  "dependencies": {
    "@arizeai/openinference-core": "^2.0.5",
    "@arizeai/phoenix-otel": "^0.4.1"
  }
}

Note: @opentelemetry/api is NOT needed - it's only for manual span management.

Why This Pattern?

1. Simple: Just export SESSION_ID, use withSpan directly - no wrappers
2. Built-in: withSpan from @arizeai/openinference-core handles everything
3. Type-safe: Preserves function signatures and type information
4. Automatic lifecycle: Handles span creation, error tracking, and cleanup
5. Framework-agnostic: Works with any LLM framework (AI SDK, LangChain, etc.)
6. No extra deps: Don't need @opentelemetry/api or custom utilities

Adding More Attributes

import { withSpan } from "@arizeai/openinference-core";
import { SESSION_ID } from "./instrumentation";

const handleWithContext = withSpan(
  async (userInput: string) => {
    return await agent.generate({ prompt: userInput });
  },
  {
    name: "cli.interaction",
    kind: "CHAIN",
    attributes: {
      "session.id": SESSION_ID,
      "user.id": userId,              // Track user
      "metadata.environment": "prod",  // Custom metadata
    },
  }
);

Anti-Pattern: Don't Create Wrappers

❌ Don't do this:

// Unnecessary wrapper
export function withSessionTracking(fn) {
  return withSpan(fn, { attributes: { "session.id": SESSION_ID } });
}

✅ Do this instead:

// Use withSpan directly
import { withSpan } from "@arizeai/openinference-core";
import { SESSION_ID } from "./instrumentation";

const handler = withSpan(fn, {
  attributes: { "session.id": SESSION_ID }
});

Alternative: Context API Pattern

For web servers or complex async flows where you need to propagate session IDs through middleware, you can use the Context API:

import { context } from "@opentelemetry/api";
import { setSession } from "@arizeai/openinference-core";

await context.with(
  setSession(context.active(), { sessionId: "user_123_conv_456" }),
  async () => {
    const response = await llm.invoke(prompt);
  }
);

Use Context API when:

Building web servers with middleware chains
Session ID needs to flow through many async boundaries
You don't control the call stack (e.g., framework-provided handlers)

Use withSpan when:

Building CLI apps or scripts
You control the function call points
Simpler, more explicit code is preferred

See Also

fundamentals-universal-attributes.md - Other universal attributes (user.id, metadata)
span-chain.md - CHAIN span specification
sessions-python.md - Python session tracking patterns

Phoenix Tracing: Python Setup

Set up Phoenix tracing in Python with `arize-phoenix-otel`.

Metadata

Attribute	Value
Priority	Critical - required for all tracing
Setup Time	<5 min

Quick Start (2 lines)

from phoenix.otel import register
register(project_name="my-app", auto_instrument=True)

Connects to `http://localhost:6006`, auto-instruments all supported libraries.

Installation

pip install arize-phoenix-otel

Supported: Python 3.10-3.13

Configuration

Environment Variables (Recommended)

export PHOENIX_API_KEY="your-api-key"  # Required for Phoenix Cloud
export PHOENIX_COLLECTOR_ENDPOINT="http://localhost:6006"  # Or Cloud URL
export PHOENIX_PROJECT_NAME="my-app"  # Optional

Python Code

from phoenix.otel import register

tracer_provider = register(
    project_name="my-app",              # Project name
    endpoint="http://localhost:6006",   # Phoenix endpoint
    auto_instrument=True,               # Auto-instrument supported libs
    batch=True,                         # Batch processing (default: True)
)

Parameters:

project_name: Project name (overrides PHOENIX_PROJECT_NAME)
endpoint: Phoenix URL (overrides PHOENIX_COLLECTOR_ENDPOINT)
auto_instrument: Enable auto-instrumentation (default: False)
batch: Use BatchSpanProcessor (default: True, production-recommended)
protocol: "http/protobuf" (default) or "grpc"

Auto-Instrumentation

Install instrumentors for your frameworks:

pip install openinference-instrumentation-openai      # OpenAI SDK
pip install openinference-instrumentation-langchain   # LangChain
pip install openinference-instrumentation-llama-index # LlamaIndex
# ... install others as needed

Then enable auto-instrumentation:

register(project_name="my-app", auto_instrument=True)

Phoenix discovers and instruments all installed OpenInference packages automatically.

Batch Processing (Production)

Enabled by default. Configure via environment variables:

export OTEL_BSP_SCHEDULE_DELAY=5000           # Batch every 5s
export OTEL_BSP_MAX_QUEUE_SIZE=2048           # Queue 2048 spans
export OTEL_BSP_MAX_EXPORT_BATCH_SIZE=512     # Send 512 spans/batch

Link: https://opentelemetry.io/docs/specs/otel/configuration/sdk-environment-variables/
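To give a feel for how the queue size and batch size settings interact, here is a toy model of a batching processor in Python. `ToyBatchProcessor` and its methods are hypothetical names for this illustration and deliberately simplified (no background thread or timer); it is not the OpenTelemetry SDK's `BatchSpanProcessor`.

```python
from collections import deque
from typing import Deque, List


class ToyBatchProcessor:
    """Toy model of a batching span processor (illustrative only)."""

    def __init__(self, max_queue_size: int = 2048, max_export_batch_size: int = 512) -> None:
        self.max_queue_size = max_queue_size
        self.max_export_batch_size = max_export_batch_size
        self.queue: Deque[str] = deque()
        self.dropped = 0
        self.exported_batches: List[List[str]] = []

    def on_end(self, span: str) -> None:
        """Queue a finished span, or drop it if the queue is already full."""
        if len(self.queue) >= self.max_queue_size:
            self.dropped += 1
            return
        self.queue.append(span)

    def export_once(self) -> None:
        """What a timer would trigger after each schedule delay: send one batch."""
        size = min(self.max_export_batch_size, len(self.queue))
        if size:
            self.exported_batches.append([self.queue.popleft() for _ in range(size)])

    def shutdown(self) -> None:
        """Drain everything still queued, as a flush before exit would."""
        while self.queue:
            self.export_once()


processor = ToyBatchProcessor(max_queue_size=5, max_export_batch_size=2)
for i in range(7):
    processor.on_end(f"span-{i}")
processor.shutdown()
print([len(b) for b in processor.exported_batches], "dropped:", processor.dropped)
# prints: [2, 2, 1] dropped: 2
```

In this model a queue that is too small for the span rate loses data, while a larger batch size means fewer, bigger exports.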

Verification

1. Open Phoenix UI: http://localhost:6006
2. Navigate to your project
3. Run your application
4. Check for traces (appear within batch delay)

Troubleshooting

No traces:

Verify PHOENIX_COLLECTOR_ENDPOINT matches Phoenix server
Set PHOENIX_API_KEY for Phoenix Cloud
Confirm instrumentors installed

Missing attributes:

Check span kind (see rules/ directory)
Verify attribute names (see rules/ directory)

Example

from phoenix.otel import register
from openai import OpenAI

# Enable tracing with auto-instrumentation
register(project_name="my-chatbot", auto_instrument=True)

# OpenAI automatically instrumented
client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}]
)


TypeScript Setup

Set up Phoenix tracing in TypeScript/JavaScript with @arizeai/phoenix-otel.

Metadata

Attribute	Value
Priority	Critical - required for all tracing
Setup Time	<5 min

Quick Start

npm install @arizeai/phoenix-otel

import { register } from "@arizeai/phoenix-otel";
register({ projectName: "my-app" });

Connects to http://localhost:6006 by default.

Configuration

import { register } from "@arizeai/phoenix-otel";

register({
  projectName: "my-app",
  url: "http://localhost:6006",
  apiKey: process.env.PHOENIX_API_KEY,
  batch: true
});

Environment variables:

export PHOENIX_API_KEY="your-api-key"
export PHOENIX_COLLECTOR_ENDPOINT="http://localhost:6006"
export PHOENIX_PROJECT_NAME="my-app"

ESM vs CommonJS

CommonJS (automatic):

const { register } = require("@arizeai/phoenix-otel");
register({ projectName: "my-app" });

const OpenAI = require("openai");

ESM (manual instrumentation required):

import { register, registerInstrumentations } from "@arizeai/phoenix-otel";
import { OpenAIInstrumentation } from "@arizeai/openinference-instrumentation-openai";
import OpenAI from "openai";

register({ projectName: "my-app" });

const instrumentation = new OpenAIInstrumentation();
instrumentation.manuallyInstrument(OpenAI);
registerInstrumentations({ instrumentations: [instrumentation] });

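Why: ESM imports are hoisted and resolved before any other code in the module runs, so register() cannot patch openai before it loads. manuallyInstrument() patches the already-imported module instead.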

Framework Integration

Next.js (App Router):

// instrumentation.ts
export async function register() {
  if (process.env.NEXT_RUNTIME === "nodejs") {
    const { register } = await import("@arizeai/phoenix-otel");
    register({ projectName: "my-nextjs-app" });
  }
}

Express.js:

import { register } from "@arizeai/phoenix-otel";
import express from "express";

register({ projectName: "my-express-app" });

const app = express();

Flushing Spans Before Exit

CRITICAL: Spans may not be exported if still queued in the processor when your process exits. Call provider.shutdown() to explicitly flush before exit.

Standard pattern:

const provider = register({
  projectName: "my-app",
  batch: true,
});

async function main() {
  await doWork();
  await provider.shutdown();  // Flush spans before exit
}

main().catch(async (error) => {
  console.error(error);
  await provider.shutdown();  // Flush on error too
  process.exit(1);
});

Alternative:

// Use batch: false for immediate export (no shutdown needed)
register({
  projectName: "my-app",
  batch: false,
});

For production patterns including graceful termination, see production-typescript.md.

Verification

1. Open Phoenix UI: http://localhost:6006
2. Run your application
3. Check for traces in your project

Enable diagnostic logging:

import { DiagLogLevel, register } from "@arizeai/phoenix-otel";

register({
  projectName: "my-app",
  diagLogLevel: DiagLogLevel.DEBUG,
});

Troubleshooting

No traces:

Verify PHOENIX_COLLECTOR_ENDPOINT is correct
Set PHOENIX_API_KEY for Phoenix Cloud
For ESM: Ensure manuallyInstrument() is called
With `batch: true`: Call provider.shutdown() before exit to flush queued spans (see Flushing Spans section)

Traces missing:

With batch: true: Call await provider.shutdown() before process exit to flush queued spans
Alternative: Set batch: false for immediate export (no shutdown needed)

Missing attributes:

Check instrumentation is registered (ESM requires manual setup)
See instrumentation-auto-typescript.md

AGENT Spans

AGENT spans represent autonomous reasoning blocks (ReAct agents, planning loops, multi-step decision making).

Required: openinference.span.kind = "AGENT"

Example

{
  "openinference.span.kind": "AGENT",
  "input.value": "Book a flight to New York for next Monday",
  "output.value": "I've booked flight AA123 departing Monday at 9:00 AM"
}

CHAIN Spans

Purpose

CHAIN spans represent orchestration layers in your application (LangChain chains, custom workflows, application entry points). Often used as root spans.

Required Attributes

Attribute	Type	Description	Required
`openinference.span.kind`	String	Must be "CHAIN"	Yes

Common Attributes

CHAIN spans typically use Universal Attributes:

input.value - Input to the chain (user query, request payload)
output.value - Output from the chain (final response)
input.mime_type / output.mime_type - Format indicators

Example: Root Chain

{
  "openinference.span.kind": "CHAIN",
  "input.value": "{\"question\": \"What is the capital of France?\"}",
  "input.mime_type": "application/json",
  "output.value": "{\"answer\": \"The capital of France is Paris.\", \"sources\": [\"doc_123\"]}",
  "output.mime_type": "application/json",
  "session.id": "session_abc123",
  "user.id": "user_xyz789"
}

Example: Nested Sub-Chain

{
  "openinference.span.kind": "CHAIN",
  "input.value": "Summarize this document: ...",
  "output.value": "This document discusses..."
}

EMBEDDING Spans

Purpose

EMBEDDING spans represent vector generation operations (text-to-vector conversion for semantic search).

Required Attributes

Attribute	Type	Description	Required
`openinference.span.kind`	String	Must be "EMBEDDING"	Yes
`embedding.model_name`	String	Embedding model identifier	Recommended

Attribute Reference

Single Embedding

Attribute	Type	Description
`embedding.model_name`	String	Embedding model identifier
`embedding.text`	String	Input text to embed
`embedding.vector`	String (JSON array)	Generated embedding vector

Example:

{
  "embedding.model_name": "text-embedding-ada-002",
  "embedding.text": "What is machine learning?",
  "embedding.vector": "[0.023, -0.012, 0.045, ..., 0.001]"
}

Batch Embeddings

Attribute Pattern	Type	Description
`embedding.embeddings.{i}.embedding.text`	String	Text at index i
`embedding.embeddings.{i}.embedding.vector`	String (JSON array)	Vector at index i

Example:

{
  "embedding.model_name": "text-embedding-ada-002",
  "embedding.embeddings.0.embedding.text": "First document",
  "embedding.embeddings.0.embedding.vector": "[0.1, 0.2, 0.3, ..., 0.5]",
  "embedding.embeddings.1.embedding.text": "Second document",
  "embedding.embeddings.1.embedding.vector": "[0.6, 0.7, 0.8, ..., 0.9]"
}

Vector Format

Vectors stored as JSON array strings:

Dimensions: Typically 384, 768, 1536, or 3072
Format: "[0.123, -0.456, 0.789, ...]"
Precision: Usually 3-6 decimal places

Storage Considerations:

Large vectors can significantly increase trace size
Consider omitting vectors in production (keep embedding.text for debugging)
Use separate vector database for actual similarity search
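The batch attribute pattern and the storage advice above can be combined in a small helper. The sketch below is a hypothetical Python function (`embedding_attributes` and its `include_vectors` flag are names invented for this example) that flattens a list of texts and vectors into the indexed attribute keys, optionally leaving the vectors out.

```python
import json
from typing import Dict, Sequence, Tuple


def embedding_attributes(
    model_name: str,
    items: Sequence[Tuple[str, Sequence[float]]],
    include_vectors: bool = True,
) -> Dict[str, str]:
    """Flatten (text, vector) pairs into indexed EMBEDDING span attributes."""
    attrs = {
        "openinference.span.kind": "EMBEDDING",
        "embedding.model_name": model_name,
    }
    for i, (text, vector) in enumerate(items):
        prefix = f"embedding.embeddings.{i}.embedding"
        attrs[f"{prefix}.text"] = text
        if include_vectors:
            # Vectors are stored as JSON array strings
            attrs[f"{prefix}.vector"] = json.dumps(list(vector))
    return attrs


docs = [("First document", [0.1, 0.2, 0.3]), ("Second document", [0.4, 0.5, 0.6])]
full = embedding_attributes("text-embedding-ada-002", docs)
slim = embedding_attributes("text-embedding-ada-002", docs, include_vectors=False)
print(len(full), len(slim))  # 6 4
```

Dropping the vectors keeps the text available for debugging while avoiding the largest part of the payload.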

Examples

Single Embedding

{
  "openinference.span.kind": "EMBEDDING",
  "embedding.model_name": "text-embedding-ada-002",
  "embedding.text": "What is machine learning?",
  "embedding.vector": "[0.023, -0.012, 0.045, ..., 0.001]",
  "input.value": "What is machine learning?",
  "output.value": "[0.023, -0.012, 0.045, ..., 0.001]"
}

Batch Embeddings

{
  "openinference.span.kind": "EMBEDDING",
  "embedding.model_name": "text-embedding-ada-002",
  "embedding.embeddings.0.embedding.text": "First document",
  "embedding.embeddings.0.embedding.vector": "[0.1, 0.2, 0.3]",
  "embedding.embeddings.1.embedding.text": "Second document",
  "embedding.embeddings.1.embedding.vector": "[0.4, 0.5, 0.6]",
  "embedding.embeddings.2.embedding.text": "Third document",
  "embedding.embeddings.2.embedding.vector": "[0.7, 0.8, 0.9]"
}

EVALUATOR Spans

Purpose

EVALUATOR spans represent quality assessment operations (answer relevance, faithfulness, hallucination detection).

Required Attributes

Attribute	Type	Description	Required
`openinference.span.kind`	String	Must be "EVALUATOR"	Yes

Common Attributes

Attribute	Type	Description
`input.value`	String	Content being evaluated
`output.value`	String	Evaluation result (score, label, explanation)
`metadata.evaluator_name`	String	Evaluator identifier
`metadata.score`	Float	Numeric score (0-1)
`metadata.label`	String	Categorical label (relevant/irrelevant)

Example: Answer Relevance

{
  "openinference.span.kind": "EVALUATOR",
  "input.value": "{\"question\": \"What is the capital of France?\", \"answer\": \"The capital of France is Paris.\"}",
  "input.mime_type": "application/json",
  "output.value": "0.95",
  "metadata.evaluator_name": "answer_relevance",
  "metadata.score": 0.95,
  "metadata.label": "relevant",
  "metadata.explanation": "Answer directly addresses the question with correct information"
}

Example: Faithfulness Check

{
  "openinference.span.kind": "EVALUATOR",
  "input.value": "{\"context\": \"Paris is in France.\", \"answer\": \"Paris is the capital of France.\"}",
  "input.mime_type": "application/json",
  "output.value": "0.5",
  "metadata.evaluator_name": "faithfulness",
  "metadata.score": 0.5,
  "metadata.label": "partially_faithful",
  "metadata.explanation": "Answer makes unsupported claim about Paris being the capital"
}
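
As a rough sketch, the evaluator examples above could be produced by a small helper like the one below. The function name and its parameters are assumptions made for illustration; the attribute keys mirror the examples.

```python
import json


def evaluator_attributes(evaluator_name, payload, score, label, explanation=None):
    """Build EVALUATOR span attributes (hypothetical helper)."""
    attrs = {
        "openinference.span.kind": "EVALUATOR",
        # The evaluated content is serialized as a JSON string.
        "input.value": json.dumps(payload),
        "input.mime_type": "application/json",
        "output.value": str(score),
        "metadata.evaluator_name": evaluator_name,
        "metadata.score": score,
        "metadata.label": label,
    }
    if explanation is not None:
        attrs["metadata.explanation"] = explanation
    return attrs


relevance = evaluator_attributes(
    "answer_relevance",
    {"question": "What is the capital of France?",
     "answer": "The capital of France is Paris."},
    score=0.95,
    label="relevant",
)
```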

GUARDRAIL Spans

Purpose

GUARDRAIL spans represent safety and policy checks (content moderation, PII detection, toxicity scoring).

Required Attributes

Attribute	Type	Description	Required
`openinference.span.kind`	String	Must be "GUARDRAIL"	Yes

Common Attributes

Attribute	Type	Description
`input.value`	String	Content being checked
`output.value`	String	Guardrail result (allowed/blocked/flagged)
`metadata.guardrail_type`	String	Type of check (toxicity, pii, bias)
`metadata.score`	Float	Score from the check (0-1); in the example below, a higher score means a more likely violation
`metadata.threshold`	Float	Threshold for blocking

Example: Content Moderation

{
  "openinference.span.kind": "GUARDRAIL",
  "input.value": "User message: I want to build a bomb",
  "output.value": "BLOCKED",
  "metadata.guardrail_type": "content_moderation",
  "metadata.score": 0.95,
  "metadata.threshold": 0.7,
  "metadata.categories": "[\"violence\", \"weapons\"]",
  "metadata.action": "block_and_log"
}

Example: PII Detection

{
  "openinference.span.kind": "GUARDRAIL",
  "input.value": "My SSN is 123-45-6789",
  "output.value": "FLAGGED",
  "metadata.guardrail_type": "pii_detection",
  "metadata.detected_pii": "[\"ssn\"]",
  "metadata.redacted_output": "My SSN is [REDACTED]"
}
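
The relationship between `metadata.score` and `metadata.threshold` can be sketched as follows. This assumes that a score at or above the threshold blocks the content, which matches the content moderation example (0.95 against 0.7) but is not stated as a rule in the source. The helper name is hypothetical.

```python
def guardrail_attributes(text, guardrail_type, score, threshold):
    """Build GUARDRAIL span attributes (hypothetical helper).

    Assumption: a score at or above the threshold means the content is blocked.
    """
    blocked = score >= threshold
    return {
        "openinference.span.kind": "GUARDRAIL",
        "input.value": text,
        "output.value": "BLOCKED" if blocked else "ALLOWED",
        "metadata.guardrail_type": guardrail_type,
        "metadata.score": score,
        "metadata.threshold": threshold,
    }


flagged = guardrail_attributes("example message", "content_moderation", 0.95, 0.7)
passed = guardrail_attributes("example message", "content_moderation", 0.20, 0.7)
```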

LLM Spans

LLM spans represent calls to language models (OpenAI, Anthropic, local models, etc.).

Required Attributes

Attribute	Type	Description
`openinference.span.kind`	String	Must be "LLM"
`llm.model_name`	String	Model identifier (e.g., "gpt-4", "claude-3-5-sonnet-20241022")

Key Attributes

Category	Attributes	Example
Model	`llm.model_name`, `llm.provider`	"gpt-4-turbo", "openai"
Tokens	`llm.token_count.prompt`, `llm.token_count.completion`, `llm.token_count.total`	25, 8, 33
Cost	`llm.cost.prompt`, `llm.cost.completion`, `llm.cost.total`	0.0021, 0.0045, 0.0066
Parameters	`llm.invocation_parameters` (JSON)	`{"temperature": 0.7, "max_tokens": 1024}`
Messages	`llm.input_messages.{i}.*`, `llm.output_messages.{i}.*`	See examples below
Tools	`llm.tools.{i}.tool.json_schema`	Function definitions

Cost Tracking

Core attributes:

llm.cost.prompt - Total input cost (USD)
llm.cost.completion - Total output cost (USD)
llm.cost.total - Total cost (USD)

Detailed cost breakdown:

llm.cost.prompt_details.{input,cache_read,cache_write,audio} - Input cost components
llm.cost.completion_details.{output,reasoning,audio} - Output cost components
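
One way to fill the core cost attributes is to derive them from token counts. The sketch below assumes per-1,000-token prices supplied by the caller; the prices shown are placeholders, not real rates, and the helper name is hypothetical.

```python
def cost_attributes(prompt_tokens, completion_tokens,
                    prompt_price_per_1k, completion_price_per_1k):
    """Derive token and cost attributes in USD (hypothetical helper)."""
    prompt_cost = prompt_tokens / 1000 * prompt_price_per_1k
    completion_cost = completion_tokens / 1000 * completion_price_per_1k
    return {
        "llm.token_count.prompt": prompt_tokens,
        "llm.token_count.completion": completion_tokens,
        "llm.token_count.total": prompt_tokens + completion_tokens,
        "llm.cost.prompt": round(prompt_cost, 6),
        "llm.cost.completion": round(completion_cost, 6),
        # The total is the sum of the input and output costs.
        "llm.cost.total": round(prompt_cost + completion_cost, 6),
    }


# Placeholder prices for illustration only.
costs = cost_attributes(25, 8, prompt_price_per_1k=0.003,
                        completion_price_per_1k=0.015)
```

If the detailed breakdown is recorded as well, its components would presumably sum to `llm.cost.prompt` and `llm.cost.completion` respectively.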

Messages

Input messages:

llm.input_messages.{i}.message.role - "user", "assistant", "system", "tool"
llm.input_messages.{i}.message.content - Text content
llm.input_messages.{i}.message.contents.{j} - Multimodal (text + images)
llm.input_messages.{i}.message.tool_calls - Tool invocations

Output messages: Same structure as input messages, under the llm.output_messages.{i}. prefix.

Example: Basic LLM Call

{
  "openinference.span.kind": "LLM",
  "llm.model_name": "claude-3-5-sonnet-20241022",
  "llm.invocation_parameters": "{\"temperature\": 0.7, \"max_tokens\": 1024}",
  "llm.input_messages.0.message.role": "system",
  "llm.input_messages.0.message.content": "You are a helpful assistant.",
  "llm.input_messages.1.message.role": "user",
  "llm.input_messages.1.message.content": "What is the capital of France?",
  "llm.output_messages.0.message.role": "assistant",
  "llm.output_messages.0.message.content": "The capital of France is Paris.",
  "llm.token_count.prompt": 25,
  "llm.token_count.completion": 8,
  "llm.token_count.total": 33
}

Example: LLM with Tool Calls

{
  "openinference.span.kind": "LLM",
  "llm.model_name": "gpt-4-turbo",
  "llm.input_messages.0.message.content": "What's the weather in SF?",
  "llm.output_messages.0.message.tool_calls.0.tool_call.function.name": "get_weather",
  "llm.output_messages.0.message.tool_calls.0.tool_call.function.arguments": "{\"location\": \"San Francisco\"}",
  "llm.tools.0.tool.json_schema": "{\"type\": \"function\", \"function\": {\"name\": \"get_weather\"}}"
}
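
The message and tool-call keys in the two examples above follow a regular pattern, so they can be generated from a list of messages. The sketch below assumes a simple dictionary shape for each message (`role`, `content`, `tool_calls` with `name` and `arguments`); that input shape and the function name are illustrative, not part of the convention.

```python
import json


def flatten_messages(prefix, messages):
    """Flatten chat messages into indexed span attributes (hypothetical helper)."""
    attrs = {}
    for i, msg in enumerate(messages):
        base = f"{prefix}.{i}.message"
        if "role" in msg:
            attrs[f"{base}.role"] = msg["role"]
        if "content" in msg:
            attrs[f"{base}.content"] = msg["content"]
        for j, call in enumerate(msg.get("tool_calls", [])):
            call_base = f"{base}.tool_calls.{j}.tool_call.function"
            attrs[f"{call_base}.name"] = call["name"]
            # Arguments are stored as a JSON string, as in the example above.
            attrs[f"{call_base}.arguments"] = json.dumps(call["arguments"])
    return attrs


attrs = {"openinference.span.kind": "LLM", "llm.model_name": "gpt-4-turbo"}
attrs.update(flatten_messages("llm.input_messages", [
    {"role": "user", "content": "What's the weather in SF?"},
]))
attrs.update(flatten_messages("llm.output_messages", [
    {"role": "assistant",
     "tool_calls": [{"name": "get_weather",
                     "arguments": {"location": "San Francisco"}}]},
]))
```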

RERANKER Spans

Purpose

RERANKER spans represent reordering of retrieved documents (Cohere Rerank, cross-encoder models).

Required Attributes

Attribute	Type	Description	Required
`openinference.span.kind`	String	Must be "RERANKER"	Yes

Attribute Reference

Reranker Parameters

Attribute	Type	Description
`reranker.model_name`	String	Reranker model identifier
`reranker.query`	String	Query used for reranking
`reranker.top_k`	Integer	Number of documents to return

Input Documents

Attribute Pattern	Type	Description
`reranker.input_documents.{i}.document.id`	String	Input document ID
`reranker.input_documents.{i}.document.content`	String	Input document content
`reranker.input_documents.{i}.document.score`	Float	Original retrieval score
`reranker.input_documents.{i}.document.metadata`	String (JSON)	Document metadata

Output Documents

Attribute Pattern	Type	Description
`reranker.output_documents.{i}.document.id`	String	Output document ID (reordered)
`reranker.output_documents.{i}.document.content`	String	Output document content
`reranker.output_documents.{i}.document.score`	Float	New reranker score
`reranker.output_documents.{i}.document.metadata`	String (JSON)	Document metadata

Score Comparison

Input scores (from retriever) vs. output scores (from reranker):

{
  "reranker.input_documents.0.document.id": "doc_A",
  "reranker.input_documents.0.document.score": 0.7,
  "reranker.input_documents.1.document.id": "doc_B",
  "reranker.input_documents.1.document.score": 0.9,
  "reranker.output_documents.0.document.id": "doc_B",
  "reranker.output_documents.0.document.score": 0.95,
  "reranker.output_documents.1.document.id": "doc_A",
  "reranker.output_documents.1.document.score": 0.85
}

In this example:

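Input: by score, doc_B (0.9) ranks above doc_A (0.7), even though doc_A is listed at index 0
Output: doc_B is still the highest-scoring document, and both scores increased (0.9 to 0.95, 0.7 to 0.85)
The reranker confirmed the retriever's score ordering and refined the scores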
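
The comparison above can also be done programmatically by reading the flattened attributes back into lists. This is a sketch under the assumption that input rank is determined by score and output rank by index; both function names are hypothetical.

```python
import re


def collect_documents(attrs, prefix):
    """Rebuild an ordered document list from flattened attributes."""
    pattern = re.compile(rf"^{re.escape(prefix)}\.(\d+)\.document\.(\w+)$")
    docs = {}
    for key, value in attrs.items():
        match = pattern.match(key)
        if match:
            docs.setdefault(int(match.group(1)), {})[match.group(2)] = value
    return [docs[i] for i in sorted(docs)]


def rank_changes(attrs):
    """Map each output document ID to (rank by input score, output rank)."""
    inputs = collect_documents(attrs, "reranker.input_documents")
    outputs = collect_documents(attrs, "reranker.output_documents")
    by_score = sorted(inputs, key=lambda d: d["score"], reverse=True)
    before = {d["id"]: rank for rank, d in enumerate(by_score)}
    return {d["id"]: (before.get(d["id"]), rank)
            for rank, d in enumerate(outputs)}


span_attrs = {
    "reranker.input_documents.0.document.id": "doc_A",
    "reranker.input_documents.0.document.score": 0.7,
    "reranker.input_documents.1.document.id": "doc_B",
    "reranker.input_documents.1.document.score": 0.9,
    "reranker.output_documents.0.document.id": "doc_B",
    "reranker.output_documents.0.document.score": 0.95,
    "reranker.output_documents.1.document.id": "doc_A",
    "reranker.output_documents.1.document.score": 0.85,
}
changes = rank_changes(span_attrs)
```

A document whose two ranks differ was moved by the reranker; here both documents keep their rank.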

Examples

Complete Reranking Example

{
  "openinference.span.kind": "RERANKER",
  "reranker.model_name": "cohere-rerank-v2",
  "reranker.query": "What is machine learning?",
  "reranker.top_k": 2,
  "reranker.input_documents.0.document.id": "doc_123",
  "reranker.input_documents.0.document.content": "Machine learning is a subset...",
  "reranker.input_documents.1.document.id": "doc_456",
  "reranker.input_documents.1.document.content": "Supervised learning algorithms...",
  "reranker.input_documents.2.document.id": "doc_789",
  "reranker.input_documents.2.document.content": "Neural networks are...",
  "reranker.output_documents.0.document.id": "doc_456",
  "reranker.output_documents.0.document.content": "Supervised learning algorithms...",
  "reranker.output_documents.0.document.score": 0.95,
  "reranker.output_documents.1.document.id": "doc_123",
  "reranker.output_documents.1.document.content": "Machine learning is a subset...",
  "reranker.output_documents.1.document.score": 0.88
}

RETRIEVER Spans

Purpose

RETRIEVER spans represent document/context retrieval operations (vector DB queries, semantic search, keyword search).

Required Attributes

Attribute	Type	Description	Required
`openinference.span.kind`	String	Must be "RETRIEVER"	Yes

Attribute Reference

Query

Attribute	Type	Description
`input.value`	String	Search query text

Document Schema

Attribute Pattern	Type	Description
`retrieval.documents.{i}.document.id`	String	Unique document identifier
`retrieval.documents.{i}.document.content`	String	Document text content
`retrieval.documents.{i}.document.score`	Float	Relevance score (0-1 or distance)
`retrieval.documents.{i}.document.metadata`	String (JSON)	Document metadata

Flattening Pattern for Documents

Documents are flattened using zero-indexed notation:

retrieval.documents.0.document.id
retrieval.documents.0.document.content
retrieval.documents.0.document.score
retrieval.documents.1.document.id
retrieval.documents.1.document.content
retrieval.documents.1.document.score
...

Document Metadata

Common metadata fields (stored as JSON string):

{
  "source": "knowledge_base.pdf",
  "page": 42,
  "section": "Introduction",
  "author": "Jane Doe",
  "created_at": "2024-01-15",
  "url": "https://example.com/doc",
  "chunk_id": "chunk_123"
}

Example with metadata:

{
  "retrieval.documents.0.document.id": "doc_123",
  "retrieval.documents.0.document.content": "Machine learning is a method of data analysis...",
  "retrieval.documents.0.document.score": 0.92,
  "retrieval.documents.0.document.metadata": "{\"source\": \"ml_textbook.pdf\", \"page\": 15, \"chapter\": \"Introduction\"}"
}
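
A minimal sketch of the flattening pattern, including metadata serialized as a JSON string, might look like this. The input dictionary shape and the function name are assumptions for illustration.

```python
import json


def flatten_documents(documents, prefix="retrieval.documents"):
    """Flatten retrieved documents into zero-indexed attributes (hypothetical helper)."""
    attrs = {}
    for i, doc in enumerate(documents):
        base = f"{prefix}.{i}.document"
        attrs[f"{base}.id"] = doc["id"]
        attrs[f"{base}.content"] = doc["content"]
        if "score" in doc:
            attrs[f"{base}.score"] = doc["score"]
        if doc.get("metadata"):
            # Metadata is stored as a JSON string, not as nested attributes.
            attrs[f"{base}.metadata"] = json.dumps(doc["metadata"])
    return attrs


retrieved = flatten_documents([
    {"id": "doc_123",
     "content": "Machine learning is a method of data analysis...",
     "score": 0.92,
     "metadata": {"source": "ml_textbook.pdf", "page": 15}},
    {"id": "doc_456", "content": "Second result", "score": 0.87},
])
```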

Ordering

Documents are ordered by index (0, 1, 2, ...). Typically:

Index 0 = highest scoring document
Index 1 = second highest
etc.

Preserve retrieval order in your flattened attributes.
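
If the retriever does not already return results in rank order, they can be sorted before the indices are assigned. The sketch below is an illustration with a hypothetical helper name; the `higher_is_better` flag is an assumption covering the case where the score is a distance, so that lower values rank first.

```python
def order_by_relevance(documents, higher_is_better=True):
    """Return documents sorted so that index 0 is the best match."""
    return sorted(documents, key=lambda d: d["score"], reverse=higher_is_better)


results = [
    {"id": "doc_456", "score": 0.87},
    {"id": "doc_123", "score": 0.92},
    {"id": "doc_789", "score": 0.81},
]
ranked = order_by_relevance(results)
ranked_by_distance = order_by_relevance(results, higher_is_better=False)
```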

Large Document Handling

For very long documents:

Consider truncating document.content to first N characters
Store full content in separate document store
Use document.id to reference full content
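
The truncation advice above could be applied with a helper along these lines. The 500-character default is an arbitrary assumption, and the function name is hypothetical; the document ID is left untouched so the full content can still be looked up elsewhere.

```python
def truncate_content(doc, max_chars=500):
    """Return a copy of the document with content cut to max_chars characters."""
    content = doc["content"]
    if len(content) <= max_chars:
        return dict(doc)
    return {**doc, "content": content[:max_chars]}


long_doc = {"id": "doc_123", "content": "x" * 2000, "score": 0.92}
short_doc = truncate_content(long_doc)
```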

Examples

Basic Vector Search

{
  "openinference.span.kind": "RETRIEVER",
  "input.value": "What is machine learning?",
  "retrieval.documents.0.document.id": "doc_123",
  "retrieval.documents.0.document.content": "Machine learning is a subset of artificial intelligence...",
  "retrieval.documents.0.document.score": 0.92,
  "retrieval.documents.0.document.metadata": "{\"source\": \"textbook.pdf\", \"page\": 42}",
  "retrieval.documents.1.document.id": "doc_456",
  "retrieval.documents.1.document.content": "Machine learning algorithms learn patterns from data...",
  "retrieval.documents.1.document.score": 0.87,
  "retrieval.documents.1.document.metadata": "{\"source\": \"article.html\", \"author\": \"Jane Doe\"}",
  "retrieval.documents.2.document.id": "doc_789",
  "retrieval.documents.2.document.content": "Supervised learning is a type of machine learning...",
  "retrieval.documents.2.document.score": 0.81,
  "retrieval.documents.2.document.metadata": "{\"source\": \"wiki.org\"}",
  "metadata.retriever_type": "vector_search",
  "metadata.vector_db": "pinecone",
  "metadata.top_k": 3
}

TOOL Spans

Purpose

TOOL spans represent external tool or function invocations (API calls, database queries, calculators, custom functions).

Required Attributes

Attribute	Type	Description	Required
`openinference.span.kind`	String	Must be "TOOL"	Yes
`tool.name`	String	Tool/function name	Recommended

Attribute Reference

Tool Execution Attributes

Attribute	Type	Description
`tool.name`	String	Tool/function name
`tool.description`	String	Tool purpose/description
`tool.parameters`	String (JSON)	JSON schema defining the tool's parameters
`input.value`	String (JSON)	Actual input values passed to the tool
`output.value`	String	Tool output/result
`output.mime_type`	String	Result content type (e.g., "application/json")

Examples

API Call Tool

{
  "openinference.span.kind": "TOOL",
  "tool.name": "get_weather",
  "tool.description": "Fetches current weather for a location",
  "tool.parameters": "{\"type\": \"object\", \"properties\": {\"location\": {\"type\": \"string\"}, \"units\": {\"type\": \"string\", \"enum\": [\"celsius\", \"fahrenheit\"]}}, \"required\": [\"location\"]}",
  "input.value": "{\"location\": \"San Francisco\", \"units\": \"celsius\"}",
  "output.value": "{\"temperature\": 18, \"conditions\": \"partly cloudy\"}"
}

Calculator Tool

{
  "openinference.span.kind": "TOOL",
  "tool.name": "calculator",
  "tool.description": "Performs mathematical calculations",
  "tool.parameters": "{\"type\": \"object\", \"properties\": {\"expression\": {\"type\": \"string\", \"description\": \"Math expression to evaluate\"}}, \"required\": [\"expression\"]}",
  "input.value": "{\"expression\": \"2 + 2\"}",
  "output.value": "4"
}

Database Query Tool

{
  "openinference.span.kind": "TOOL",
  "tool.name": "sql_query",
  "tool.description": "Executes SQL query on user database",
  "tool.parameters": "{\"type\": \"object\", \"properties\": {\"query\": {\"type\": \"string\", \"description\": \"SQL query to execute\"}}, \"required\": [\"query\"]}",
  "input.value": "{\"query\": \"SELECT * FROM users WHERE id = 123\"}",
  "output.value": "[{\"id\": 123, \"name\": \"Alice\", \"email\": \"alice@example.com\"}]",
  "output.mime_type": "application/json"
}
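
The three tool examples share one shape, which a small helper can produce. This sketch assumes that structured results (dictionaries or lists) are serialized as JSON and tagged with `output.mime_type`, while other results are stored as plain strings; that rule and the helper name are illustrative choices, not requirements from the convention.

```python
import json


def tool_attributes(name, description, parameters_schema, arguments, result):
    """Build TOOL span attributes (hypothetical helper)."""
    attrs = {
        "openinference.span.kind": "TOOL",
        "tool.name": name,
        "tool.description": description,
        # The schema describes the parameters; input.value holds the actual values.
        "tool.parameters": json.dumps(parameters_schema),
        "input.value": json.dumps(arguments),
    }
    if isinstance(result, (dict, list)):
        attrs["output.value"] = json.dumps(result)
        attrs["output.mime_type"] = "application/json"
    else:
        attrs["output.value"] = str(result)
    return attrs


calculator = tool_attributes(
    "calculator",
    "Performs mathematical calculations",
    {"type": "object",
     "properties": {"expression": {"type": "string"}},
     "required": ["expression"]},
    {"expression": "2 + 2"},
    4,
)
weather = tool_attributes(
    "get_weather",
    "Fetches current weather for a location",
    {"type": "object",
     "properties": {"location": {"type": "string"}},
     "required": ["location"]},
    {"location": "San Francisco"},
    {"temperature": 18, "conditions": "partly cloudy"},
)
```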

How it compares

Choose phoenix-tracing when you already run or plan to run Arize Phoenix and need OpenInference-correct spans rather than generic logging wrappers.

FAQ

Which languages?

Python and TypeScript, each with its own setup reference guide.

What packages are required?

arize-phoenix-otel for Python, or @arizeai/phoenix-otel for TypeScript. Both also require a Phoenix server.

Where to start?

Start with the setup references, then move on to the auto-instrumentation or manual instrumentation guides.

Is Phoenix Tracing safe to install?

skills.sh reports 2 of 3 security scanners passed. Review the Security Audits panel on this page before installing in production.
