Observability Llm Obs

Name: Observability Llm Obs
Author: elastic

elastic/agent-skills

Answer questions about LLM and agentic app health, token spend, quality, and trace chains using data already ingested into Elasticsearch.

Overview

Observability LLM Obs is an agent skill, most often used in the Operate phase (and also for Grow analytics), that answers LLM monitoring questions by running ES|QL and Elasticsearch API queries against ingested trace and metric data.

Install

npx skills add https://github.com/elastic/agent-skills --skill observability-llm-obs

What is this skill?

Scopes answers to Elastic-ingested traces and metrics (APM, OTLP, OpenLLMetry/OpenLIT/Langtrace paths)
Uses ES|QL, Elasticsearch APIs, and Kibana APIs without requiring the Kibana UI
Discovers which ingestion path exists (`traces*`, OTel generic streams) before querying
Covers LLM performance, token/cost utilization, response quality, and agent workflow orchestration
Aligns with Elastic OpenTelemetry LLM use-case documentation for EDOT-style deployments
Skill metadata version 0.1.0

Compatible agents: Claude Code, Cursor, Codex, any compatible agent

Adoption & trust: 1k installs on skills.sh; 502 GitHub stars; 3/3 security scanners passed (skills.sh audits).

What problem does it solve?

You run agents in production but cannot tie latency, token cost, and multi-hop workflows to real telemetry in your Elastic deployment.

Who is it for?

Indie builders and tiny teams standardized on Elastic/Kibana who export OpenTelemetry or APM spans from LLM and agent frameworks.

Skip if: You have no Elasticsearch observability pipeline yet and need setup guides first. This skill queries existing data; it does not install agents from scratch.

When should I use this skill?

User asks about LLM monitoring, GenAI observability, AI cost/quality, or workflow orchestration on Elastic-ingested data.

What do I get? / Deliverables

You get ES|QL- and API-grounded visibility into LLM performance, spend, and orchestration chains from whatever trace sources your cluster already has.

ES|QL or API query patterns for LLM traces
Interpretation of performance, cost, and chain quality from cluster data

Journey fit

Spans multiple journey phases - primary shelf plus alternate fits below.

Primary fit

Operate: Monitoring & observability

Operate → Monitoring is the primary shelf because the skill assumes production telemetry and ES|QL/API investigation, not greenfield instrumentation design alone. Monitoring fits ongoing trace, metric, and cost visibility for GenAI workloads rather than launch-time distribution work.

Also useful

Grow: Analytics & insights

Where it fits

Operate: Monitoring & observability
Example use: Trace a sudden latency regression across multi-step agent spans in `traces*` indices.

Grow: Analytics & insights
Example use: Compare weekly token utilization and cost drivers per model from ingested OTel metrics.

Ship: CI/CD & deploy
Example use: Pre-launch sanity-check that production LLM spans and required fields land in Elasticsearch before go-live.

How it compares

Elastic-data interrogation skill, not a generic prompt-quality coach or a standalone SaaS APM product.

Common Questions / FAQ

Who is observability-llm-obs for?

Developers operating LLM or agentic apps on Elastic who want an agent to query traces, costs, and quality from ingested observability data.

When should I use observability-llm-obs?

In Operate when incidents or spend spikes need trace-backed answers; in Grow when analyzing token trends; whenever users mention LLM monitoring, GenAI observability, or AI cost on Elastic.

Is observability-llm-obs safe to install?

It directs the agent to call APIs on your Elastic deployment, so review the Security Audits panel on this page and scope cluster credentials and network access carefully.

SKILL.md

# LLM and Agentic Observability

Answer user questions about monitoring LLMs and agentic components using **data ingested into Elastic** only. Focus on
LLM performance, cost and token utilization, response quality, and call chaining or agentic workflow orchestration. Use
**ES|QL**, Elasticsearch APIs, and (where needed) Kibana APIs. Do not rely on Kibana UI; the skill works without it. A
given deployment typically uses **one or more** ingestion paths (APM/OTLP traces **and/or** integration metrics/logs)—
discover what is available before querying.

## Where to look

- **Trace and metrics data (APM / OTel):** Trace data in Elastic is stored in **`traces-apm*`** data streams when
  collected by the Elastic APM Agent, and in **`traces-generic.otel-default`** (and similar) when collected by
  OpenTelemetry. Use the generic pattern **`traces*`** to find all trace data regardless of source. When the application is instrumented with
  OpenTelemetry (e.g. Elastic
  [Distributions of OpenTelemetry (EDOT)](https://www.elastic.co/docs/solutions/observability/get-started/opentelemetry/use-cases/llms),
  OpenLLMetry, OpenLIT, Langtrace exporting to OTLP), LLM and agent spans land in these trace data streams; metrics may
  land in **`metrics-apm*`** or metrics-generic. Query **`traces*`** and **`metrics*`** data streams for per-request and
  aggregated LLM signals.
- **Integration metrics and logs:** When the user collects data via
  [Elastic LLM integrations](https://www.elastic.co/docs/solutions/observability/applications/llm-observability)
  (OpenAI, Azure OpenAI, Azure AI Foundry, Amazon Bedrock, Bedrock AgentCore, GCP Vertex AI, etc.), metrics and logs go
  to **integration data streams** (e.g. `metrics*`, `logs*` with dataset/namespace per integration). Check which data
  streams exist.
- **Discover first:** Use Elasticsearch to list data streams or indices (e.g. `GET _data_stream`, or
  `GET traces*/_mapping`, `GET metrics*/_mapping`) and optionally sample a document to see which LLM-related fields are
  present. Do not assume both APM and integration data exist.
- **ES|QL:** Use the **elasticsearch-esql** skill for ES|QL syntax, commands, and query patterns when building queries
  against `traces*` or metrics data streams.
- **Alerts and SLOs:** Use the [Observability APIs](https://www.elastic.co/docs/solutions/observability/apis) **SLOs
  API** ([Stack](https://www.elastic.co/docs/api/doc/kibana/group/endpoint-slo) |
  [Serverless](https://www.elastic.co/docs/api/doc/serverless/group/endpoint-slo)) and **Alerting API**
  ([Stack](https://www.elastic.co/docs/api/doc/kibana/group/endpoint-alerting) |
  [Serverless](https://www.elastic.co/docs/api/doc/serverless/group/endpoint-alerting)) to find SLOs and alerting rules
  that target LLM-related data (e.g. services backed by `traces*`, or integration metrics). Firing alerts or
  violated/degrading SLOs point to potential degraded performance.
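
The "discover first" step above can be sketched in Python. This is a minimal sketch, not part of the skill itself: it assumes you have already fetched the JSON body of `GET _data_stream` with whatever client you use, and it only reads the `data_streams[].name` field. The function name `classify_data_streams`, the grouping labels, and the sample stream names are illustrative assumptions.

```python
from fnmatch import fnmatch

# Index patterns taken from the list above; the labels are illustrative.
PATTERNS = {
    "traces": "traces*",
    "metrics": "metrics*",
    "logs": "logs*",
}


def classify_data_streams(response: dict) -> dict:
    """Group data stream names from a `GET _data_stream` response by pattern."""
    found = {label: [] for label in PATTERNS}
    for stream in response.get("data_streams", []):
        name = stream.get("name", "")
        for label, pattern in PATTERNS.items():
            if fnmatch(name, pattern):
                found[label].append(name)
    return found


# Hypothetical response, trimmed to the one field this sketch reads.
sample_response = {
    "data_streams": [
        {"name": "traces-generic.otel-default"},
        {"name": "metrics-generic.otel-default"},
        {"name": "logs-myapp-default"},  # made-up name for illustration
    ]
}

available = classify_data_streams(sample_response)
has_traces = bool(available["traces"])
```

An empty list for a label means that ingestion path is absent, so queries against it should be skipped rather than assumed to return data.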

## Data available in Elastic

### From traces and metrics (`traces*`, `metrics-apm*` / `metrics-generic`)

Spans from OTel/EDOT (and compatible SDKs) carry **span attributes** that may follow
[OpenTelemetry GenAI semantic conventions](https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-spans/) or
provider-specific names. In Elasticsearch, attributes typically appear under `span.attributes` (exact key names depend
on ingestion). Common attributes:

| Purpose              | Example attribute names (OTel GenAI)                      |
| -------------------- | --------------------------------------------------------- |
| Operation / provider | `gen_ai.operation.name`, `gen_ai.provider.name`           |
| Model                | `gen_ai.request.model`, `gen_ai.response.model`           |
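
Because exact key names depend on ingestion, it can help to resolve the real field paths from a sampled document before writing a query. The following is a minimal sketch under stated assumptions: the sample document shape, the helper names (`flatten`, `find_genai_fields`, `spans_per_model_query`), and the model value are hypothetical, and the generated ES|QL is only one plausible shape for a "spans per model" question.

```python
# Attribute names from the table above.
GENAI_ATTRIBUTES = (
    "gen_ai.operation.name",
    "gen_ai.provider.name",
    "gen_ai.request.model",
    "gen_ai.response.model",
)


def flatten(doc, prefix=""):
    """Yield (dotted_path, value) pairs for every leaf in a nested document."""
    for key, value in doc.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            yield from flatten(value, path)
        else:
            yield path, value


def find_genai_fields(doc):
    """Map each known GenAI attribute to the full field path found in `doc`."""
    found = {}
    for path, _ in flatten(doc):
        for attr in GENAI_ATTRIBUTES:
            if path == attr or path.endswith("." + attr):
                found[attr] = path
    return found


def spans_per_model_query(model_field, index_pattern="traces*"):
    """Build an ES|QL string that counts spans per model value."""
    return (
        f"FROM {index_pattern}\n"
        f"| WHERE {model_field} IS NOT NULL\n"
        f"| STATS span_count = COUNT(*) BY {model_field}\n"
        f"| SORT span_count DESC\n"
        f"| LIMIT 10"
    )


# Hypothetical sampled span document; real documents will have more fields.
sample_doc = {
    "span": {
        "attributes": {
            "gen_ai": {
                "request": {"model": "example-model"},
                "operation": {"name": "chat"},
            }
        }
    }
}

fields = find_genai_fields(sample_doc)
query = spans_per_model_query(fields["gen_ai.request.model"])
```

Attributes missing from `fields` were not present in the sampled document, so any query that depends on them should be treated as unsupported for that deployment until a document containing them is found.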
