Observability

Name: Observability
Author: itallstartedwithaidea

itallstartedwithaidea/agent-skills

Instrument Cloudflare Workers and edge apps with structured logs, traces, and error alerts so you can debug production without traditional APM agents.

Install

npx skills add https://github.com/itallstartedwithaidea/agent-skills --skill observability

What is this skill?

Structured JSON logging with correlation IDs for ephemeral Worker executions
Request-scoped distributed tracing with spans, timing, and metadata
Error boundaries and explicit capture with stack traces and request context
Monitoring pillar: error-rate tracking, latency percentiles, and alert thresholds
Designed for edge constraints: no log files, no injected APM—application-layer instrumentation only

Adoption & trust: 1 install on skills.sh; 18 GitHub stars; 3/3 security scanners passed (skills.sh audits); trending (+100% hot-view momentum).

Journey fit

Primary fit

Operate: Monitoring & observability

Operate is the canonical shelf because the skill’s payoff is production visibility—error rates, latency, and alerting—after code is live on Workers. Monitoring matches the three-pillar focus on tracking behavior, percentiles, and thresholds rather than initial feature build.

Common Questions / FAQ

Is Observability safe to install?

skills.sh reports 3 of 3 security scanners passed. Review the Security Audits panel on this page before installing in production.

SKILL.md


# Observability

Part of [Agent Skills™](https://github.com/itallstartedwithaidea/agent-skills) by [googleadsagent.ai™](https://googleadsagent.ai)

## Description

Observability implements structured logging, distributed tracing, and error monitoring for Cloudflare Workers and edge applications. The agent instruments code with contextual log entries, trace spans, error boundaries, and alerting rules that expose production behavior while keeping log shipping off the response path.

Workers present unique observability challenges. There are no persistent processes to attach profilers to, no filesystem for log files, and no APM agent to inject. Observability must be built into the application layer through structured log events shipped to external collectors, request-scoped trace contexts, and explicit error capture with stack traces and request metadata.
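As an illustration of the structured log events and explicit error capture described above, here is a minimal sketch. The names `RequestContext`, `formatLogEvent`, and `formatErrorEvent`, and the exact field layout, are assumptions made for this example and are not taken from the skill itself.

```typescript
// Hypothetical helpers: names and field layout are assumptions, not the skill's schema.
type Level = "debug" | "info" | "warn" | "error";

interface RequestContext {
  correlationId: string; // ties every event of one request together
  method: string;
  path: string;
}

// Serialize one event as a single JSON line so an external collector can parse it.
function formatLogEvent(
  ctx: RequestContext,
  level: Level,
  message: string,
  fields: Record<string, unknown> = {},
  now: Date = new Date(),
): string {
  return JSON.stringify({
    ...fields, // caller fields go first so they cannot overwrite the core keys below
    timestamp: now.toISOString(),
    level,
    correlationId: ctx.correlationId,
    method: ctx.method,
    path: ctx.path,
    message,
  });
}

// Capture an error explicitly, with its stack trace and the request metadata.
function formatErrorEvent(ctx: RequestContext, err: unknown, now: Date = new Date()): string {
  const e = err instanceof Error ? err : new Error(String(err));
  return formatLogEvent(ctx, "error", e.message, { error: { name: e.name, stack: e.stack } }, now);
}
```

Each returned string is one self-describing event; how it reaches the collector (for example a batched POST) is left out of this sketch.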

This skill covers three pillars: **Logging** (structured JSON events with correlation IDs), **Tracing** (request-scoped spans with timing and metadata), and **Monitoring** (error rate tracking, latency percentiles, and alerting thresholds). Together, they answer the three questions of production debugging: what happened, how long did it take, and how often does it fail.
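The Monitoring pillar can be made concrete with a small sketch of error-rate and latency-percentile calculation over a batch of request samples. The `Sample` shape, the nearest-rank percentile method, and the threshold values are all assumptions chosen for illustration; the skill does not specify them, and where the samples are aggregated (collector, dashboard backend, or elsewhere) is outside this sketch.

```typescript
// Hypothetical sample shape: one record per completed request.
interface Sample {
  durationMs: number;
  ok: boolean;
}

// Nearest-rank percentile: the smallest value with at least p% of samples at or below it.
function percentile(values: number[], p: number): number {
  if (values.length === 0) return NaN;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.min(sorted.length - 1, Math.max(0, rank - 1))];
}

// Reduce a batch of samples to the numbers a dashboard or alert rule would read.
function summarize(samples: Sample[]) {
  const durations = samples.map(s => s.durationMs);
  const failures = samples.filter(s => !s.ok).length;
  return {
    count: samples.length,
    errorRate: samples.length === 0 ? 0 : failures / samples.length,
    p50: percentile(durations, 50),
    p95: percentile(durations, 95),
    p99: percentile(durations, 99),
  };
}

// Placeholder thresholds (assumed values): alert above 5% errors or a p95 over 500 ms.
function shouldAlert(
  summary: ReturnType<typeof summarize>,
  thresholds = { maxErrorRate: 0.05, maxP95Ms: 500 },
): boolean {
  return summary.errorRate > thresholds.maxErrorRate || summary.p95 > thresholds.maxP95Ms;
}
```

Nearest-rank is used here because it always returns an observed value; an interpolating percentile would be an equally reasonable choice.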

## Use When

- Adding logging to Workers or edge applications
- Debugging production issues with distributed request tracing
- Setting up error monitoring and alerting
- Implementing health check endpoints
- Measuring and reporting latency percentiles
- Building dashboards for operational visibility
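For the health check use case, one possible shape is sketched below. It runs a set of named dependency checks and reports an overall status. The function names, the `ok`/`degraded` vocabulary, and the 200/503 status mapping are assumptions for this example, not something the skill prescribes.

```typescript
// Hypothetical types: a check resolves when healthy and rejects when not.
type Check = () => Promise<void>;

interface CheckResult {
  ok: boolean;
  duration_ms: number;
  error?: string;
}

interface HealthReport {
  status: "ok" | "degraded";
  checks: Record<string, CheckResult>;
}

// Run all checks concurrently; one failing check never prevents the others from reporting.
async function runChecks(checks: Record<string, Check>): Promise<HealthReport> {
  const results: Record<string, CheckResult> = {};
  await Promise.all(
    Object.entries(checks).map(async ([name, check]) => {
      const start = performance.now();
      try {
        await check();
        results[name] = { ok: true, duration_ms: performance.now() - start };
      } catch (err) {
        results[name] = {
          ok: false,
          duration_ms: performance.now() - start,
          error: err instanceof Error ? err.message : String(err),
        };
      }
    }),
  );
  const healthy = Object.values(results).every(r => r.ok);
  return { status: healthy ? "ok" : "degraded", checks: results };
}

// Map the report onto an HTTP response: 200 when every check passed, 503 otherwise.
function healthResponse(report: HealthReport): Response {
  return new Response(JSON.stringify(report), {
    status: report.status === "ok" ? 200 : 503,
    headers: { "content-type": "application/json" },
  });
}
```

A Worker could route a path such as `/health` (a hypothetical route) to `healthResponse(await runChecks({...}))`, with each check wrapping a call to a dependency the Worker relies on.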

## How It Works

```mermaid
graph TD
    A[Incoming Request] --> B[Generate Trace ID]
    B --> C[Start Root Span]
    C --> D[Execute Handler]
    D --> E[Child Spans for Subrequests]
    E --> F{Error?}
    F -->|Yes| G[Capture Error + Context]
    F -->|No| H[Record Success Metrics]
    G --> I[Structured Log: ERROR]
    H --> J[Structured Log: INFO]
    I --> K[Ship to Collector via waitUntil]
    J --> K
    K --> L[External: Datadog / Grafana / Sentry]
    C --> M[End Root Span + Record Latency]
    M --> K
```

Every request receives a trace ID that propagates through all subrequests and log entries. Spans measure the duration of individual operations. Logs and spans are batched and shipped asynchronously via `waitUntil()`, so shipping them adds no latency to the response.
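The flow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the skill's implementation: `ExecutionContextLike` is a minimal stand-in for the Workers execution context (only its `waitUntil` method is used), and `Ship`, `tracedHeaders`, and `handleWithTracing` are hypothetical names. The `x-trace-id` header matches the one read in the Implementation section.

```typescript
// Minimal structural stand-in for the Workers execution context (assumption for this sketch).
interface ExecutionContextLike {
  waitUntil(promise: Promise<unknown>): void;
}

interface BufferedEntry {
  traceId: string;
  timestamp: string;
  message: string;
}

// Hypothetical transport: sends one batch of entries to an external collector.
type Ship = (batch: BufferedEntry[]) => Promise<void>;

const TRACE_HEADER = "x-trace-id";

// Build headers for a subrequest so the downstream service sees the same trace ID.
function tracedHeaders(traceId: string, base?: Headers): Headers {
  const headers = new Headers(base);
  headers.set(TRACE_HEADER, traceId);
  return headers;
}

async function handleWithTracing(
  request: Request,
  ctx: ExecutionContextLike,
  ship: Ship,
  handler: (traceId: string, log: (message: string) => void) => Promise<Response>,
): Promise<Response> {
  // Reuse an inbound trace ID if present, otherwise mint a new one.
  const traceId = request.headers.get(TRACE_HEADER) ?? crypto.randomUUID();
  const batch: BufferedEntry[] = [];
  const log = (message: string): void => {
    batch.push({ traceId, timestamp: new Date().toISOString(), message });
  };
  const start = performance.now();
  try {
    return await handler(traceId, log);
  } finally {
    log(`request finished in ${(performance.now() - start).toFixed(1)} ms`);
    // Hand the upload to waitUntil so the response is not delayed by it;
    // a failed upload is swallowed here rather than surfaced to the client.
    ctx.waitUntil(ship(batch).catch(() => undefined));
  }
}
```

Inside the handler, a subrequest would pass `tracedHeaders(traceId)` as its headers so that downstream logs carry the same ID.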

## Implementation

```typescript
interface LogEntry {
  timestamp: string;
  level: "debug" | "info" | "warn" | "error";
  traceId: string;
  spanId: string;
  message: string;
  data?: Record<string, unknown>;
  error?: { name: string; message: string; stack?: string };
  duration_ms?: number;
}

class RequestTracer {
  private traceId: string;
  private spans: LogEntry[] = [];
  private startTime: number;

  constructor(request: Request) {
    this.traceId = request.headers.get("x-trace-id") ?? crypto.randomUUID();
    this.startTime = performance.now();
  }

  span<T>(name: string, fn: () => Promise<T>): Promise<T> {
    const spanId = crypto.randomUUID().slice(0, 8);
    const start = performance.now();
    return fn().then(
      result => {
        this.log("info", name, { duration_ms: performance.now() - start }, spanId);
        return result;
      },
      error => {
        this.log("error", name, {
          duration_ms: performance.now() - start,
          error: { name: error.name, message: error.message, stack: error.stack },
        }, spanId);
        throw error;
      }
    );
  }

  // Buffer one entry. `data` is spread first so callers cannot overwrite the core fields.
  log(level: LogEntry["level"], message: string, data?: Record<string, unknown>, spanId?: string): void {
    this.spans.push({
      ...data,
      timestamp: new Date().toISOString(),
      level,
      traceId: this.traceId,
      spanId: spanId ?? "root",
      message,
    });
  }

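  async flush(env: { LOG_COLLECTOR: Fetcher }): Promise<void> {
    // Assumed completion: the source listing is truncated after `rootDuration =`.
    const rootDuration = performance.now() - this.startTime;
    // The rest of flush() is missing from the source. It presumably records rootDuration
    // and sends this.spans to env.LOG_COLLECTOR; that code is not reproduced here.
  }
}
```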
