
astronomer/agents
18 skills, 13.5k installs, 6.9k stars
Install
npx skills add https://github.com/astronomer/agents

Skills in this repo

1. Analyzing Data
Analyzing-data is an agent skill for warehouse-backed business Q&A: who uses a feature, how many signups, trends, and ad hoc SQL. It targets solo and indie builders who already have a data warehouse and want the agent to follow a disciplined loop: check cached patterns and concepts, discover tables when needed, run queries through a local CLI kernel, and write learnings back before answering. That reduces repeated schema archaeology and inconsistent SQL across sessions. You invoke it when questions require database truth rather than product code alone. It pairs naturally with analytics dashboards and growth experiments where definitions must stay consistent. It requires changing into the skill directory and running the bundled scripts with uv; the discovery reference docs cover warehouse-specific search paths.
978 installs

2. Airflow
Airflow is an agent skill for solo builders and small teams who already run scheduled data or ML pipelines on Apache Airflow and need fast operational answers inside Claude Code, Cursor, or Codex. It standardizes day-two work: list DAGs, trigger runs, tail task logs, inspect connections and variables, and interpret health signals when something breaks in staging or production. The skill leans on the af command surface for inspect-and-fix loops and points to Astronomer’s Astro CLI for a batteries-included local stack: init a project, start the web UI, parse DAGs without booting the full stack, run pytest for DAG integrity, and push DAGs or full images to a remote deployment. When you are authoring net-new workflows or doing platform upgrades, it hands off to companion skills for writing DAGs, deeper debugging, deploy runbooks, and migrating from Airflow 2 to 3.
Install it when Airflow vocabulary shows up in chat and you want procedural commands instead of guessing REST endpoints or clicking through the UI alone.
955 installs

3. Authoring DAGs
Authoring DAGs is an Astronomer agent skill that walks solo builders through writing Apache Airflow DAGs using a structured Discover, Plan, and Implement flow plus the `af` CLI. It is meant for when users ask to create a new DAG, write pipeline code, or learn DAG patterns, not for when they are only debugging test failures (that path is delegated to testing-dags). The skill stresses understanding the codebase and environment first, proposing structure for approval, then implementing against established conventions. CLI access can come from Astro Otto on PATH or a standalone install. A completion hook nudges agents toward the testing-dags skill so pipelines are validated before production. This fits indie data products and SaaS backends that rely on scheduled ETL or workflow orchestration during the build phase.
818 installs

4. Debugging DAGs
Debugging DAGs is an agent skill for solo builders and small data teams running Apache Airflow who need more than a one-line failure reason. It walks through a data-engineer-style investigation: locate the failing DAG run, pull task logs through the af CLI, classify the failure, and suggest fixes that reduce recurrence. Prism places it on Operate because most invocations follow production or staging scheduler alerts, but the same workflow helps during Ship when validating pipelines and during Build when DAG import errors block deployment. The skill explicitly defers lightweight questions to the airflow entrypoint and reserves this path for phrases like "full root cause analysis" or "diagnose and fix the pipeline".
Expect to need shell access to run af against your Astro or Airflow environment, and to be comfortable reading Python operator stack traces inside scheduler logs.
806 installs

5. Migrating Airflow 2 To 3
Migrating Airflow 2 to 3 is a phase-specific agent skill from Astronomer for solo builders and small data teams who run scheduled pipelines on Apache Airflow and must move off 2.x before support and compatibility pressure forces the issue. It centers on code-level migration (imports, operators, hooks, task context, and API usage) rather than greenfield DAG authoring. The workflow starts with Ruff's Airflow migration selectors to auto-fix detectable patterns, then uses a documented manual search checklist for issues Ruff cannot resolve. The skill explicitly warns against risky jump paths: upgrade to 2.11 first, then to at least 3.0.11, with a strong preference for 3.1, to avoid rollback traps and early-3.0 bugs. Use it when users mention Airflow 3 migration or breaking changes, or when you detect 2.x code that should be modernized, after prompting for consent to upgrade.
800 installs

6. Testing DAGs
testing-dags is an Astronomer agent skill for solo builders and small teams who maintain Airflow DAGs and need more than a single test command. Install it when you ask the agent to test a DAG and fix failures, debug a failing pipeline run, or work through test-and-troubleshoot loops. The workflow distinguishes complex iterative work from simple requests: one-liners like "test dag" or "run dag" are meant for the airflow entrypoint skill, while this package owns multi-step cycles. It standardizes on the af CLI, preferably through astro otto, and optionally uses Astro CLI parse and pytest for fast feedback before hitting a live scheduler. The documented first move is always trigger-wait on the target dag_id, avoiding noisy pre-flight listing so you get signal from the run itself.
That pattern fits indie operators shipping data products or internal ETL without a dedicated data platform team.
791 installs

7. Tracing Upstream Lineage
Tracing Upstream Lineage is an Astronomer agent skill that teaches a repeatable method for answering "where does this data come from?" for tables, columns, or entire DAGs. Solo builders and small data teams on Airflow use it when debugging stale metrics, broken dashboards, or unclear ownership between pipelines. The workflow starts by classifying the target, then locating the producing DAG through naming patterns and af CLI commands, then reading task logic for load statements and upstream reads. On Astro, the Lineage tab accelerates cross-DAG dependency discovery; on open-source Airflow, the same outcomes require DAG source and task log inspection. It is phase-specific operations work for pipeline-backed SaaS and API products where analytics tables must map cleanly to orchestration code, not a general brainstorming or planning methodology.
759 installs

8. Tracing Downstream Lineage
Tracing Downstream Lineage is an agent skill for data and platform solo builders who run Airflow-style pipelines and need disciplined impact analysis before altering a table, view, or DAG. It instructs the agent to enumerate direct consumers by searching orchestration source for FROM and JOIN references, querying warehouse metadata for views that depend on the target, and watching for reporting tables that imply dashboard usage. The workflow stresses running this analysis before modifications so you understand blast radius instead of discovering broken dashboards after deploy. On Astro, it points to the Lineage tab as a faster visual complement to manual DAG source grepping. The skill fits operators and analytics engineers maintaining shared datasets, not one-off notebook exploration.
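
The FROM and JOIN search for direct consumers could be sketched roughly as below. This is an illustration, not the skill's own implementation: the function name `find_direct_consumers`, the regular expression, and the sample file names are all assumptions, and a regex scan will miss dynamically built SQL.

```python
import re


def find_direct_consumers(target: str, sources: dict[str, str]) -> list[str]:
    """Return the names of sources whose SQL reads `target` via FROM or JOIN.

    `sources` maps a hypothetical file name to the SQL text found in it.
    """
    # FROM/JOIN, whitespace, optional schema/database qualifiers, then the
    # table name ending on a word boundary (so orders_staging is not matched).
    pattern = re.compile(
        r"\b(?:FROM|JOIN)\s+(?:[\w\"`]+\.)*" + re.escape(target) + r"\b",
        re.IGNORECASE,
    )
    return sorted(name for name, sql in sources.items() if pattern.search(sql))


# Hypothetical DAG sources, for illustration only.
example_sources = {
    "dags/daily_revenue.py": "SELECT SUM(o.amount) FROM analytics.orders o",
    "dags/load_orders.py": "INSERT INTO analytics.orders SELECT * FROM raw.orders_staging",
    "dags/user_report.py": "select u.id from users u left join Analytics.Orders o on o.user_id = u.id",
}
consumers = find_direct_consumers("orders", example_sources)
```

Here the loader that writes to the table is excluded because it never reads it through FROM or JOIN; only the two reading DAGs are returned.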
Pair it with your org's actual CLI aliases, repo layout, and governance rules; the outputs are an impact inventory and risk notes you can attach to a change ticket.
743 installs

9. Profiling Tables
Profiling Tables is an agent skill that turns a single table name into a repeatable data-profile report using INFORMATION_SCHEMA and aggregate SQL. Solo builders and tiny data teams use it when onboarding to an unfamiliar warehouse table, debugging suspicious metrics, or documenting data quality before building dbt models or API features. The workflow sequences basic metadata, size and shape, then column-level statistics with different query shapes for numeric versus string fields, all intended to be executed through a run_sql capability. It does not replace full pipeline observability or automated DQ frameworks, but it gives citable counts and distributions you can paste into specs or PRs. Invoke it whenever someone asks to profile a table, understand dataset statistics, or assess structure and content quality.
742 installs

10. Managing Astro Local Env
Managing Astro Local Env is for data-minded solo builders and small teams who run Apache Airflow locally through Astronomer’s CLI instead of guessing Docker commands. It documents two execution paths: containerized `astro dev` (default) and standalone mode for Airflow 3 without Docker when `uv` is available. You get copy-paste commands to start and stop the stack, restart individual components, and wipe volumes for a clean slate, plus guidance on when to bounce the environment after dependency or image changes. The skill explicitly defers new project scaffolding to setting-up-astro-project and points to authoring-dags and testing-dags once the webserver is up.
Expect to need intermediate familiarity with terminals, optional Docker, and Airflow’s local API and logs when debugging failed parses or scheduler issues.
737 installs

11. Setting Up Astro Project
Setting Up Astro Project is an Astronomer agent skill that gets solo builders from zero to a conventional Astro CLI repository layout without version pins that silently age out. The default command is astro dev init, which materializes dags, include, plugins, tests, Dockerfile, and package manifests: the folders your agent and teammates need before any scheduler logic exists. The skill deliberately tells agents not to pass airflow-version or runtime-version flags unless the user asks, and then to verify what was installed from the generated Dockerfile instead of hallucinating pins from training data. It situates Astro as the managed path while naming Apache Airflow Docker Compose for local open-source dev and the Helm chart for Kubernetes production, with explicit handoffs to managing-astro-local-env, authoring-dags, and deploying-airflow. It is ideal when you are adding batch ETL, analytics refreshes, or ML feature pipelines to a SaaS backend and want reproducible project bootstrap rather than ad hoc folder creation.
730 installs

12. Checking Freshness
Checking Freshness is an agent skill for solo builders and small data teams who need to know whether a table is safe to query before shipping metrics, running downstream jobs, or answering stakeholder questions. It walks through finding the right timestamp column (ETL load markers versus application updated_at fields), running a last-update query with elapsed time math, and optionally validating recent row volume over the past week. The skill encodes a simple reporting scale so agents state Fresh, Acceptable, Stale, or Critical instead of vague guesses. It fits operators maintaining Airflow-style pipelines, analytics engineers validating dbt marts, and indie SaaS founders debugging why a revenue chart looks wrong.
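
As a minimal sketch of the elapsed-time math and four-level scale, assuming the last-update timestamp has already been fetched from the warehouse: the hour cutoffs below are hypothetical placeholders, since the listing does not state the skill's actual thresholds, and `classify_freshness` is an illustrative name.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical cutoffs in hours; the skill's real thresholds are not given here.
THRESHOLDS = [(6, "Fresh"), (24, "Acceptable"), (72, "Stale")]


def classify_freshness(last_update: datetime, now: Optional[datetime] = None) -> str:
    """Map the age of the most recent row onto the four-level reporting scale."""
    now = now or datetime.now(timezone.utc)
    age_hours = (now - last_update).total_seconds() / 3600
    for limit_hours, label in THRESHOLDS:
        if age_hours <= limit_hours:
            return label
    # Older than every cutoff above.
    return "Critical"


# Fixed reference time so the example is reproducible.
reference = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
labels = [
    classify_freshness(reference - timedelta(hours=h), now=reference)
    for h in (2, 12, 48, 240)
]
```

Passing `now` explicitly keeps the classification deterministic; in practice the cutoffs would be tuned to each table's expected load cadence.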
Invoke it whenever someone asks if data is up to date, when a table was last updated, or whether figures are stale before a demo or launch review.
723 installs

13. Annotating Task Lineage
Annotating Task Lineage guides solo builders and small data teams through manual Airflow lineage using task inlets and outlets when built-in OpenLineage extraction is missing or insufficient. It fits the Build phase while you wire DAGs to warehouses, lakes, and downstream consumers, and it pairs with the Apache Airflow OpenLineage provider patterns documented for operator maintainers. The skill includes a decision table: prefer modifying get_openlineage_facets_on_* hooks when they exist; use inlets/outlets for quick table-level graphs, operators without extractors, and setups that avoid custom code. It explicitly defers column-level lineage and heavy custom extraction to OpenLineage methods or dedicated extractors. On Astronomer Astro, the same annotations surface in the enhanced Lineage tab for organization-wide cross-DAG views, which is useful when you need a credible data-flow map for compliance conversations without standing up a separate lineage project first.
683 installs

14. Cosmos dbt Core
Cosmos dbt Core is a workflow skill for solo builders and small data teams who already have dbt Core models and need them running under Apache Airflow via Astronomer Cosmos. Before any code is written, it forces verification of the dbt engine (Core, not Fusion), warehouse type, Airflow major version, where tasks execute, whether you want a full DAG or a TaskGroup, and whether a compiled manifest is available. The body walks through ProjectConfig and subsequent Cosmos configuration in order, favoring the simplest setup that meets constraints. It aligns with PyPI’s astronomer-cosmos package and documents version assumptions so you do not mix Fusion guidance with Core.
Use it when you are building the integration layer, not when choosing analytics semantics or warehouse modeling from scratch.
678 installs

15. Airflow HITL
airflow-hitl is an agent skill for solo builders and small teams who run Apache Airflow 3.1 or newer and need real human gates in production DAGs: approve/reject with skip-on-reject, multi-select options, branch routing, structured forms, or advanced HITLTrigger usage. It walks through a two-step ritual: pick the capability from the table, then discover the live operator signature from the registry before writing tasks, because Astronomer’s provider renames parameters between releases. Operators are deferrable, so workers are not tied up while someone acts in the UI or REST API under Required Actions. The skill cross-references airflow for registry commands and airflow-ai when the job is LLM work, not human input. It is ideal when compliance, ops, or product owners must sign off inside the same orchestration graph you already ship.
664 installs

16. Creating OpenLineage Extractors
creating-openlineage-extractors is a practitioner skill for solo builders and small data teams who run Apache Airflow and need trustworthy lineage when built-in operator support stops. It walks through when to add OpenLineage methods on operators you maintain versus building a custom extractor for third-party packages, and when column-level or complex logic demands more than simple inlets and outlets. The guidance favors the simplest maintainable approach because extractors drift from operator behavior and are harder to debug. Astronomer-oriented notes clarify that on Astro, lineage events are already collected without extra transport setup, so your work focuses on extraction correctness.
Use it while wiring new DAG tasks, wrapping vendor operators, or standardizing observability before production cutover.
664 installs

17. Cosmos dbt Fusion
Cosmos dbt Fusion is a configuration reference skill from Astronomer for solo builders and tiny data teams running dbt Fusion inside Apache Airflow via Cosmos. Fusion’s public beta constraints matter: only LOCAL execution against Snowflake or Databricks warehouses is supported, so the skill documents the exact ProfileConfig patterns (profile name, target name, Airflow conn_id, and optional profile_args like schema) and points to operator_args and Airflow 3 compatibility. It is not a full dbt modeling course; it is the cheat sheet you hand an agent when generating DAG code that must not invent unsupported warehouses or execution modes. Use it during backend/analytics build when your product depends on scheduled transformations rather than notebook one-offs.
636 installs

18. Warehouse Init
warehouse-init is an Astronomer agent skill that runs a structured discovery pass over your configured data warehouse and materializes everything into `.astro/warehouse.md`. Solo builders and small data teams use it when they want agents to answer “which table has X?” from a checked-in reference instead of issuing broad INFORMATION_SCHEMA queries every session. The workflow reads `~/.astro/agents/warehouse.yml`, launches codebase exploration for dbt and SQL documentation, pulls live metadata through the analyzing-data CLI, and merges row-count signals so large tables are obvious. Run it once per project after warehouse credentials exist, then refresh when migrations or model renames land. The output is plain markdown your repo can diff in PRs, which makes onboarding and agent grounding much cheaper than ad hoc warehouse browsing.
554 installs