
Analyzing Data
Answer product and business metrics questions by querying your warehouse with cached patterns instead of guessing SQL every time.
Install
npx skills add https://github.com/astronomer/agents --skill analyzing-data
What is this skill?
- Five-step workflow: pattern lookup, concept lookup, discovery, exec via cli.py, then cache learnings
- Pattern and concept caches via uv run scripts/cli.py to reuse query strategies and table mappings
- Kernel auto-starts on the first exec call; run_sql returns a Polars DataFrame and run_sql_pandas a Pandas DataFrame for result inspection
- Falls back to INFORMATION_SCHEMA queries and a grep over the repo's SQL files when the cache misses
- Explicit cache-success/failure recording to improve future pattern hits
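The steps listed above can be scripted as a sequence of CLI invocations. The following is a minimal Python sketch, not part of the skill itself: `workflow_commands` and every sample value passed to it (question, concept, table, key column, pattern name) are hypothetical. It only builds the argument lists documented in SKILL.md and prints them, without running anything.

```python
import shlex

# Base invocation from the skill's docs; it must be run from the skill's directory.
CLI = ["uv", "run", "scripts/cli.py"]


def workflow_commands(question, concept, table, key_col, pattern_name):
    """Return the CLI invocations for one pass through the workflow, in order.

    Table discovery is left out because it is a grep or an INFORMATION_SCHEMA
    query rather than a dedicated subcommand.
    """
    return [
        CLI + ["pattern", "lookup", question],
        CLI + ["concept", "lookup", concept],
        CLI + ["exec", f"df = run_sql('SELECT COUNT(*) FROM {table}')"],
        CLI + ["concept", "learn", concept, table, "-k", key_col],
        CLI + ["pattern", "record", pattern_name, "--success"],
    ]


# Hypothetical inputs, used only to show the shape of the commands.
commands = workflow_commands(
    question="how many active customers last month?",
    concept="customers",
    table="ANALYTICS.CUSTOMERS",
    key_col="CUSTOMER_ID",
    pattern_name="active_customers",
)

for argv in commands:
    # shlex.join quotes each argument so the line can be pasted into a shell.
    print(shlex.join(argv))
```

Keeping each command as a list rather than a single string means the question and the Python snippet reach `cli.py` as one argument each, so no manual shell quoting is needed if the lists are later passed to `subprocess.run`.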
Adoption & trust: 978 installs on skills.sh; 384 GitHub stars; 2/3 security scanners passed (skills.sh audits).
Journey fit
Canonical shelf is Grow because the skill turns live warehouse data into answers for counts, cohorts, and trends—not for building schemas or shipping features. Analytics is the right subphase: lookups, SQL execution, and presenting metrics are core lifecycle measurement work for solo builders tracking usage and revenue.
Common Questions / FAQ
Is Analyzing Data safe to install?
skills.sh reports 2 of 3 security scanners passed. Review the Security Audits panel on this page before installing in production.
SKILL.md
# Data Analysis

Answer business questions by querying the data warehouse. The kernel auto-starts on first `exec` call.

**All CLI commands below are relative to this skill's directory.** Before running any `scripts/cli.py` command, `cd` to the directory containing this file.

## Workflow

1. **Pattern lookup**: Check for a cached query strategy.
   ```bash
   uv run scripts/cli.py pattern lookup "<user's question>"
   ```
   If a pattern exists, follow its strategy. Record the outcome after executing:
   ```bash
   uv run scripts/cli.py pattern record <name> --success  # or --failure
   ```

2. **Concept lookup**: Find known table mappings.
   ```bash
   uv run scripts/cli.py concept lookup <concept>
   ```

3. **Table discovery**: If cache misses, search the codebase (`Grep pattern="<concept>" glob="**/*.sql"`) or query `INFORMATION_SCHEMA`. See [reference/discovery-warehouse.md](reference/discovery-warehouse.md).

4. **Execute query**:
   ```bash
   uv run scripts/cli.py exec "df = run_sql('SELECT ...')"
   uv run scripts/cli.py exec "print(df)"
   ```

5. **Cache learnings**: Always cache before presenting results.
   ```bash
   # Cache concept → table mapping
   uv run scripts/cli.py concept learn <concept> <TABLE> -k <KEY_COL>
   # Cache query strategy (if discovery was needed)
   uv run scripts/cli.py pattern learn <name> -q "question" -s "step" -t "TABLE" -g "gotcha"
   ```

6. **Present findings** to user.

## Kernel Functions

| Function | Returns |
|----------|---------|
| `run_sql(query, limit=100)` | Polars DataFrame |
| `run_sql_pandas(query, limit=100)` | Pandas DataFrame |

`pl` (Polars) and `pd` (Pandas) are pre-imported.

## CLI Reference

### Kernel

```bash
uv run scripts/cli.py warehouse list    # List warehouses
uv run scripts/cli.py start [-w name]   # Start kernel (with optional warehouse)
uv run scripts/cli.py exec "..."        # Execute Python code
uv run scripts/cli.py status            # Kernel status
uv run scripts/cli.py restart           # Restart kernel
uv run scripts/cli.py stop              # Stop kernel
uv run scripts/cli.py install <pkg>     # Install package
```

### Concept Cache

```bash
uv run scripts/cli.py concept lookup <name>                        # Look up
uv run scripts/cli.py concept learn <name> <TABLE> -k <KEY_COL>    # Learn
uv run scripts/cli.py concept list                                 # List all
uv run scripts/cli.py concept import -p /path/to/warehouse.md      # Bulk import
```

### Pattern Cache

```bash
uv run scripts/cli.py pattern lookup "question"                                       # Look up
uv run scripts/cli.py pattern learn <name> -q "..." -s "..." -t "TABLE" -g "gotcha"   # Learn
uv run scripts/cli.py pattern record <name> --success                                 # Record outcome
uv run scripts/cli.py pattern list                                                    # List all
uv run scripts/cli.py pattern delete <name>                                           # Delete
```

### Table Schema Cache

```bash
uv run scripts/cli.py table lookup <TABLE>            # Look up schema
uv run scripts/cli.py table cache <TABLE> -c '[...]'  # Cache schema
uv run scripts/cli.py table list                      # List cached
uv run scripts/cli.py table delete <TABLE>            # Delete
```

### Cache Management

```bash
uv run scripts/cli.py cache status                 # Stats
uv run scripts/cli.py cache clear [--stale-only]   # Clear
```

## References

- [reference/discovery-warehouse.md](reference/discovery-warehouse.md): Large table handling, warehouse exploration, INFORMATION_SCHEMA queries
- [reference/common-patterns.md](reference/common-patterns.md): SQL templates
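
The table discovery step falls back to `INFORMATION_SCHEMA` when the caches miss. As a rough illustration, the Python sketch below builds a column-name search and wraps it in the documented `run_sql(query, limit=100)` call for the `exec` subcommand. The helper names `discovery_sql` and `exec_command` are hypothetical, and the query assumes a warehouse that exposes the standard `INFORMATION_SCHEMA.COLUMNS` view without extra qualification. Some warehouses require a database or dataset prefix, so treat the SQL as a starting point rather than the skill's own discovery query.

```python
import shlex

# Base invocation from the CLI reference; run it from the skill's directory.
CLI = ["uv", "run", "scripts/cli.py"]


def discovery_sql(concept: str) -> str:
    """Build a column-name search against INFORMATION_SCHEMA.COLUMNS."""
    # Double any single quotes so the concept cannot break out of the SQL literal.
    needle = concept.lower().replace("'", "''")
    return (
        "SELECT table_schema, table_name, column_name, data_type "
        "FROM INFORMATION_SCHEMA.COLUMNS "
        f"WHERE LOWER(column_name) LIKE '%{needle}%'"
    )


def exec_command(sql: str, limit: int = 100) -> list[str]:
    """Wrap the SQL in a run_sql(...) call for the kernel's exec subcommand."""
    # repr() yields a valid Python string literal whatever quotes the SQL contains.
    code = f"df = run_sql({sql!r}, limit={int(limit)})"
    return CLI + ["exec", code]


# Hypothetical concept, used only to show the generated command.
argv = exec_command(discovery_sql("customer"), limit=20)
print(shlex.join(argv))
```

Once a matching table is found, the mapping can be stored with `concept learn` so the next lookup hits the cache instead of repeating the search.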