
Azure Resource Health Diagnose
Run a structured Azure health workflow on a named resource—best practices, discovery, logs/telemetry—and output a remediation plan via Azure MCP-first tooling.
Overview
azure-resource-health-diagnose is an agent skill most often used in Operate (also Ship launch, Build backend) that analyzes Azure resource health from logs and telemetry and produces a remediation plan.
Install
npx skills add https://github.com/github/awesome-copilot --skill azure-resource-health-diagnose
What is this skill?
- Step 1 loads Azure diagnostic best practices via MCP before touching the resource
- Discovers resources by name across subscriptions when resource group is unknown
- Prefers azmcp-* Azure MCP tools with Azure CLI fallback for lookup and telemetry
- Requires authenticated Azure MCP and a deployed resource that emits logs
- Ends with a comprehensive remediation plan for issues found in analysis
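The discovery behaviour in the list above can be sketched as follows. This is a minimal illustration under stated assumptions, not the skill's own implementation: it assumes the JSON array printed by `az resource list --name <resource-name>` has already been captured as text, and the function name `pick_resource` and the sample resource names are hypothetical. It returns the single match when the name is unambiguous, and otherwise hands back the candidates so the user can be asked for a resource group.

```python
import json


def pick_resource(az_output: str, name: str):
    """Return (match, candidates) for resources named `name`.

    `az_output` is assumed to be the JSON array printed by
    `az resource list --name <resource-name>`.
    """
    resources = json.loads(az_output)
    # Compare names case-insensitively (an assumption of this sketch).
    matches = [
        r for r in resources
        if r.get("name", "").casefold() == name.casefold()
    ]
    if len(matches) == 1:
        return matches[0], []
    # Zero or several matches: nothing is chosen, the caller must ask the user.
    return None, matches


# Hypothetical sample output: the same name exists in two resource groups.
sample = json.dumps([
    {"name": "orders-api", "type": "Microsoft.Web/sites", "resourceGroup": "rg-prod"},
    {"name": "orders-api", "type": "Microsoft.Web/sites", "resourceGroup": "rg-staging"},
])

match, candidates = pick_resource(sample, "orders-api")
if match is None:
    groups = sorted(r["resourceGroup"] for r in candidates)
    print("Ambiguous name; specify one of:", ", ".join(groups))
```

Returning the candidates instead of guessing mirrors the workflow's instruction to prompt for a subscription or resource group when several resources share a name.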
Adoption & trust: 8.4k installs on skills.sh; 34.6k GitHub stars; 3/3 security scanners passed (skills.sh audits).
What problem does it solve?
You know an Azure resource is misbehaving but lack a repeatable MCP-driven path from discovery through logs to concrete fixes.
Who is it for?
Indie operators with Azure MCP wired up who need structured triage on a specific live resource.
Skip if: Greenfield architecture with no deployed resource, environments without Azure MCP auth, or non-Azure clouds.
When should I use this skill?
You need to analyze Azure resource health, diagnose issues from logs and telemetry, and create a remediation plan for identified problems.
What do I get? / Deliverables
You receive a diagnosis informed by Azure best practices and telemetry plus a prioritized remediation plan for identified problems.
- Health and issue diagnosis summary from MCP-guided analysis
- Remediation plan for problems identified in the workflow
Journey fit
Spans multiple journey phases - primary shelf plus alternate fits below.
Production issue diagnosis and remediation planning belongs on the Operate shelf even when you run it right after a failed deploy. The skill centers on health status, logs, and telemetry interpretation—monitoring and diagnosis, not greenfield provisioning.
Where it fits
Triage a production App Service that errors after a config change using MCP discovery and log analysis.
Correlate telemetry on a failing Function app and draft remediation steps before rolling back.
Post-deploy smoke failure on staging—run health workflow before promoting to production.
Validate that a new Azure resource integration emits expected logs before you mark the feature done.
How it compares
Use it as a multi-step diagnostic workflow, not as a one-shot MCP tool listing or a set of generic bash az scripts with no remediation framing.
Common Questions / FAQ
Who is azure-resource-health-diagnose for?
Solo builders and small teams operating Azure-hosted apps who want agent-guided health analysis and fix planning with MCP-first tooling.
When should I use azure-resource-health-diagnose?
In Operate monitoring during incidents; after Ship deploys when health checks fail; in Build when validating staging backends—once the resource exists and emits logs.
Is azure-resource-health-diagnose safe to install?
It assumes live Azure access and read/analysis operations; check Security Audits on this page and scope MCP credentials to least privilege before diagnosis.
SKILL.md
# Azure Resource Health & Issue Diagnosis

This workflow analyzes a specific Azure resource to assess its health status, diagnose potential issues using logs and telemetry data, and develop a comprehensive remediation plan for any problems discovered.

## Prerequisites

- Azure MCP server configured and authenticated
- Target Azure resource identified (name and optionally resource group/subscription)
- Resource must be deployed and running to generate logs/telemetry
- Prefer Azure MCP tools (`azmcp-*`) over direct Azure CLI when available

## Workflow Steps

### Step 1: Get Azure Best Practices

**Action**: Retrieve diagnostic and troubleshooting best practices
**Tools**: Azure MCP best practices tool
**Process**:

1. **Load Best Practices**:
   - Execute Azure best practices tool to get diagnostic guidelines
   - Focus on health monitoring, log analysis, and issue resolution patterns
   - Use these practices to inform diagnostic approach and remediation recommendations

### Step 2: Resource Discovery & Identification

**Action**: Locate and identify the target Azure resource
**Tools**: Azure MCP tools + Azure CLI fallback
**Process**:

1. **Resource Lookup**:
   - If only resource name provided: Search across subscriptions using `azmcp-subscription-list`
   - Use `az resource list --name <resource-name>` to find matching resources
   - If multiple matches found, prompt user to specify subscription/resource group
   - Gather detailed resource information:
     - Resource type and current status
     - Location, tags, and configuration
     - Associated services and dependencies
2. **Resource Type Detection**:
   - Identify resource type to determine appropriate diagnostic approach:
     - **Web Apps/Function Apps**: Application logs, performance metrics, dependency tracking
     - **Virtual Machines**: System logs, performance counters, boot diagnostics
     - **Cosmos DB**: Request metrics, throttling, partition statistics
     - **Storage Accounts**: Access logs, performance metrics, availability
     - **SQL Database**: Query performance, connection logs, resource utilization
     - **Application Insights**: Application telemetry, exceptions, dependencies
     - **Key Vault**: Access logs, certificate status, secret usage
     - **Service Bus**: Message metrics, dead letter queues, throughput

### Step 3: Health Status Assessment

**Action**: Evaluate current resource health and availability
**Tools**: Azure MCP monitoring tools + Azure CLI
**Process**:

1. **Basic Health Check**:
   - Check resource provisioning state and operational status
   - Verify service availability and responsiveness
   - Review recent deployment or configuration changes
   - Assess current resource utilization (CPU, memory, storage, etc.)
2. **Service-Specific Health Indicators**:
   - **Web Apps**: HTTP response codes, response times, uptime
   - **Databases**: Connection success rate, query performance, deadlocks
   - **Storage**: Availability percentage, request success rate, latency
   - **VMs**: Boot diagnostics, guest OS metrics, network connectivity
   - **Functions**: Execution success rate, duration, error frequency

### Step 4: Log & Telemetry Analysis

**Action**: Analyze logs and telemetry to identify issues and patterns
**Tools**: Azure MCP monitoring tools for Log Analytics queries
**Process**:

1. **Find Monitoring Sources**:
   - Use `azmcp-monitor-workspace-list` to identify Log Analytics workspaces
   - Locate Application Insights instances associated with the resource
   - Identify relevant log tables using `azmcp-monitor-table-list`
2. **Execute Diagnostic Queries**: Use `azmcp-monitor-log-query` with targeted KQL queries based on resource type:

**General Error Analysis**:

```kql
// Recent errors and exceptions
union isfuzzy=true