
ADK Observability Guide
Configure tracing, logging, and analytics for Google ADK agents in production so you can debug real traffic and improve performance.
Overview
ADK Observability Guide is an agent skill most often used in Operate (also Ship, Build) that documents Cloud Trace, logging, BigQuery Agent Analytics, and troubleshooting for Google ADK agents.
Install
npx skills add https://github.com/google/adk-docs --skill adk-observability-guide
What is this skill?
- Frontmatter marks the skill as MUST READ before ADK observability setup or production traffic analysis
- Observability tiers table to choose Cloud Trace, logging, and analytics depth
- Scaffold path: Terraform-preconfigured trace and logging with reference verification commands
- BigQuery Agent Analytics plugin coverage including GCS offloading and tool provenance
- Troubleshooting-oriented guidance for understanding deployed agent behavior under real load
- Documents an observability tiers comparison table for scope and default-state decisions
- References dedicated files for cloud trace/logging and BigQuery Agent Analytics plugin setup
Adoption & trust: 2.6k installs on skills.sh; 1.4k GitHub stars; 3/3 security scanners passed (skills.sh audits).
What problem does it solve?
Your ADK agent is live but you cannot trace requests, inspect prompt-response pairs, or analyze tool usage at production scale.
Who is it for?
Indie builders and small teams running scaffolded or hand-rolled ADK deployments on GCP who need a single checklist before turning on monitoring.
Skip if: you are prototyping locally with no deploy path, or your team does not use Google ADK and only needs generic APM docs.
When should I use this skill?
MUST READ before setting up observability for ADK agents or when analyzing production traffic, debugging agent behavior, or improving agent performance.
What do I get? / Deliverables
You select an observability tier, enable trace and logging (or BQ analytics), and gain verifiable signals to debug agent behavior and improve performance on real traffic.
- Observability tier selection aligned to traffic and debugging needs
- Configured trace and prompt-response logging (or BQ analytics) with verification steps
- Reference-backed troubleshooting path for production agent behavior
Journey fit
Spans multiple journey phases - primary shelf plus alternate fits below.
Operate → monitoring is the canonical shelf because the guide targets live traffic analysis, production debugging, and ongoing performance tuning after deploy. Content spans Cloud Trace, prompt-response logging, BigQuery Agent Analytics, and tiered observability—core production monitoring concerns.
Where it fits
Plan env vars and trace hooks while scaffolding an ADK service so monitoring is not bolted on late.
Verify Terraform-provisioned trace and logging before exposing the agent to beta users.
Use observability tiers to enable Cloud Trace and prompt-response logging for incident triage.
Turn on BigQuery Agent Analytics to study conversation patterns and tool provenance for product improvements.
How it compares
Framework-specific observability guide for ADK on GCP—not a generic log-shipping skill or a security pentest workflow.
Common Questions / FAQ
Who is adk-observability-guide for?
It is for developers operating Google ADK agents who need tracing, logging, and analytics aligned with ADK docs and optional Terraform scaffolds.
When should I use adk-observability-guide?
Use it before first production deploy to configure observability, when analyzing live agent traffic, when debugging tool or prompt failures, and during Operate iterations to improve agent quality.
Is adk-observability-guide safe to install?
It describes GCP logging and analytics that handle production traffic; review the Security Audits panel on this Prism page and your data-retention policies before enabling prompt-response logging.
SKILL.md
# ADK Observability Guide

> **Scaffolded project?** Cloud Trace and prompt-response logging are pre-configured by Terraform. See `references/cloud-trace-and-logging.md` for infrastructure details, env vars, and verification commands.
>
> **No scaffold?** Follow the ADK docs links below for manual setup. For production infrastructure, scaffold with `/adk-scaffold`.

### Reference Files

| File | Contents |
|------|----------|
| `references/cloud-trace-and-logging.md` | Scaffolded project details: Terraform-provisioned resources, environment variables, verification commands, enabling/disabling locally |
| `references/bigquery-agent-analytics.md` | BQ Agent Analytics plugin: enabling, key features, GCS offloading, tool provenance |

---

## Observability Tiers

Choose the right level of observability based on your needs:

| Tier | What It Does | Scope | Default State | Best For |
|------|-------------|-------|---------------|----------|
| **Cloud Trace** | Distributed tracing: execution flow, latency, errors via OpenTelemetry spans | All templates, all environments | Always enabled | Debugging latency, understanding agent execution flow |
| **Prompt-Response Logging** | GenAI interactions exported to GCS, BigQuery, and Cloud Logging | ADK agents only | Disabled locally, enabled when deployed | Auditing LLM interactions, compliance |
| **BigQuery Agent Analytics** | Structured agent events (LLM calls, tool use, outcomes) to BigQuery | ADK agents with plugin enabled | Opt-in (`--bq-analytics` at scaffold time) | Conversational analytics, custom dashboards, LLM-as-judge evals |
| **Third-Party Integrations** | External observability platforms (AgentOps, Phoenix, MLflow, etc.) | Any ADK agent | Opt-in, per-provider setup | Team collaboration, specialized visualization, prompt management |

**Ask the user** which tier(s) they need (they can be combined). Cloud Trace is always on; the others are additive.

---

## Cloud Trace

ADK uses OpenTelemetry to emit distributed traces. Every agent invocation produces spans that track the full execution flow.

### Span Hierarchy

```
invocation
└── agent_run (one per agent in the chain)
    ├── call_llm (model request/response)
    └── execute_tool (tool execution)
```

### Setup by Deployment Type

| Deployment | Setup |
|-----------|-------|
| **Agent Engine** | Automatic: traces are exported to Cloud Trace by default |
| **Cloud Run (scaffolded)** | Automatic: `otel_to_cloud=True` in the FastAPI app |
| **GKE (scaffolded)** | Automatic: `otel_to_cloud=True` in the FastAPI app |
| **Cloud Run / GKE (manual)** | Configure OpenTelemetry exporter in your app |
| **Local dev** | Works with `make playground`; traces visible in Cloud Console |

View traces: **Cloud Console → Trace → Trace explorer**

For detailed setup instructions (Agent Engine CLI/SDK, Cloud Run, custom deployments), fetch `https://adk.dev/integrations/cloud-trace/index.md`.

---

## Prompt-Response Logging

Captures GenAI interactions (model name, tokens, timing) and exports to GCS (JSONL), BigQuery (external tables), and Cloud Logging (dedicated bucket). Privacy-preserving by default: only metadata is logged unless explicitly configured otherwise.

Key env var: `OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT`. Set it to `NO_CONTENT` (metadata only, default in deployed envs), `true` (full content), or `false` (disabled). Logging is disabled locally unless `LOGS_BUCKET_NAME` is set.
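
The env var rules above can be checked before a run with a small helper. The following is a minimal sketch, not part of ADK or the scaffold: `describe_logging_mode` and its return labels are hypothetical names invented here. It assumes an exact, case-sensitive match on the three documented values and that a missing `LOGS_BUCKET_NAME` means logging is off. The guide does not say what happens when the capture variable is unset or holds another value, so the sketch reports `unknown` in those cases rather than guessing.

```python
import os
from typing import Mapping, Optional

CAPTURE_VAR = "OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT"
BUCKET_VAR = "LOGS_BUCKET_NAME"

# Documented values mapped to labels invented for this sketch.
_MODES = {
    "NO_CONTENT": "metadata-only",  # metadata only, default in deployed envs
    "true": "full-content",         # full prompt/response content
    "false": "disabled",            # logging disabled
}


def describe_logging_mode(env: Optional[Mapping[str, str]] = None) -> str:
    """Hypothetical helper: summarize prompt-response logging from env vars."""
    env = os.environ if env is None else env

    # Assumption: without a bucket name there is nowhere to export logs,
    # so logging is treated as disabled.
    if not env.get(BUCKET_VAR):
        return "disabled"

    raw = env.get(CAPTURE_VAR)
    if raw is None:
        # Behavior for an unset variable is not stated in the guide.
        return "unknown"

    # Assumption: exact, case-sensitive match on the documented values.
    return _MODES.get(raw.strip(), "unknown")


if __name__ == "__main__":
    print(f"Prompt-response logging: {describe_logging_mode()}")
```

Running the script in the same shell that launches the agent shows which mode the current environment would select. Passing a plain dict instead of `os.environ` makes the helper easy to exercise in tests.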