E2e Medallion Architecture

Name: E2e Medallion Architecture
Author: microsoft

microsoft/skills-for-fabric

141 installs
886 repo stars
Updated July 23, 2026
microsoft/skills-for-fabric

About

>. > **Update Check — ONCE PER SESSION (mandatory)** > The first time this skill is used in a session, run the **check-updates** skill before proceeding. > - **GitHub Copilot CLI / VS Code**: invoke the `check-updates` skill. > - **Claude Code / Cowork / Cursor / Windsurf / Codex**: compare local vs remote package.json version. > - Skip if the check was already performed earlier in this session.

> **Update Check — ONCE PER SESSION (mandatory)**
> The first time this skill is used in a session, run the **check-updates** skill before proceeding.
> - **GitHub Copilot CLI / VS Code**: invoke the `check-updates` skill.
> - **Claude Code / Cowork / Cursor / Windsurf / Codex**: compare local vs remote package.json version.
> - Skip if the check was already performed earlier in this session.

E2e Medallion Architecture by the numbers

141 all-time installs (skills.sh)
Ranked #463 of 1,453 DevOps & CI/CD skills by installs in the Skillselion catalog
Security screen: LOW risk (skills.sh audit)
Data as of Jul 28, 2026 (Skillselion catalog sync)

At a glance

e2e-medallion-architecture capabilities & compatibility

Capabilities: > **update check — once per session (mandatory)* · > the first time this skill is used in a session · > **github copilot cli / vs code**: invoke the · > **claude code / cowork / cursor / windsurf /
Use cases: documentation

From the docs

What e2e-medallion-architecture says it does

>

SKILL.md

npx skills add https://github.com/microsoft/skills-for-fabric --skill e2e-medallion-architecture

Add your badge

Show developers this skill is listed on Skillselion. Paste this into your README.

[![Listed on Skillselion](https://skillselion.com/badge/skills/microsoft/skills-for-fabric/e2e-medallion-architecture.svg)](https://skillselion.com/skills/microsoft/skills-for-fabric/e2e-medallion-architecture)

Installs	141
repo stars	★ 886
Security audit	3 / 3 scanners passed
Last updated	July 23, 2026
Repository	microsoft/skills-for-fabric ↗

How do I apply e2e-medallion-architecture using the workflow in its SKILL.md?

Who is it for?

Developers following the e2e-medallion-architecture skill for the tasks it documents.

Skip if: Tasks outside the e2e-medallion-architecture scope described in SKILL.md.

When should I use this skill?

User mentions e2e-medallion-architecture or related triggers from the skill description.

What you get

Working e2e-medallion-architecture setup aligned with the documented patterns and constraints.

Files

SKILL.mdMarkdownGitHub ↗

Update Check — ONCE PER SESSION (mandatory)

The first time this skill is used in a session, run the check-updates skill before proceeding.

- GitHub Copilot CLI / VS Code: invoke the check-updates skill.

- Claude Code / Cowork / Cursor / Windsurf / Codex: compare local vs remote package.json version.

- Skip if the check was already performed earlier in this session.

CRITICAL NOTES

1. To find the workspace details (including its ID) from workspace name: list all workspaces and, then, use JMESPath filtering

2. To find the item details (including its ID) from workspace ID, item type, and item name: list all items of that type in that workspace and, then, use JMESPath filtering

End-to-End Medallion Architecture

Prerequisite Knowledge

Read these companion documents — they contain the foundational context this skill depends on:

COMMON-CORE.md — Fabric REST API patterns, authentication, token audiences, item discovery
COMMON-CLI.md — az rest, az login, token acquisition, Fabric REST via CLI
SPARK-AUTHORING-CORE.md — Notebook deployment, lakehouse creation, job execution
notebook-api-operations.md — Required for notebook creation — .ipynb structure requirements, cell format, getDefinition/updateDefinition workflow

For Spark-specific optimization details, see data-engineering-patterns.md.

---

Architecture Overview

Medallion Architecture is a data lakehouse pattern with three progressive layers:

Layer	Purpose	Optimization Profile	Use Case
Bronze (Raw)	Land raw data exactly as received	Write-optimized, append-only, partitioned by ingestion date	Audit trail, reprocessing, lineage
Silver (Cleaned)	Deduplicated, validated, conformed data	Balanced read/write, partitioned by business date	Feature engineering, operational reporting
Gold (Aggregated)	Pre-calculated metrics for analytics	Read-optimized (ZORDER, compaction), partitioned by month/year	Power BI reports, dashboards, ad-hoc analytics via SQL endpoint

Bronze: Schema-on-read — flexible schema, Delta time travel supports audit and rollback
Silver: Schema enforcement — reject non-conforming writes; handle schema evolution with mergeSchema when sources change
Gold: Strict schema governance — curated, business-approved datasets only

---

Must/Prefer/Avoid

MUST DO

Choose lakehouse architecture based on schema-enabled availability (see infrastructure-orchestration.md):
Preferred: Schema-enabled lakehouse → create ONE workspace + ONE lakehouse with bronze, silver, gold schemas
Legacy: Non-schema-enabled → create separate workspaces per layer (Bronze, Silver, Gold) for governance and access control
Use Livy API for schema and table creation — to create schemas and tables in a schema-enabled lakehouse, submit Spark SQL statements via Livy sessions (POST /livyApi/versions/2023-12-01/sessions → POST .../statements). This is the only programmatic REST path for DDL operations (CREATE SCHEMA, CREATE TABLE) in Fabric lakehouses.
Add metadata columns in Bronze: ingestion timestamp, source file, batch ID
Apply data quality rules in the Bronze-to-Silver transformation (deduplication, null handling, range validation)
Use Delta Lake format for all medallion layer tables
Use partition-aware overwrite in Silver/Gold writes to avoid reprocessing unchanged data
Include validation steps after each layer (row counts, schema checks, anomaly detection)
Follow the `.ipynb` validation + Fabric nuances in notebook-api-operations.md when creating notebooks via REST API — every code cell must include "outputs": [] and "execution_count": null
Complete the full end-to-end flow — do not stop after creating notebooks; always bind lakehouses, execute notebooks sequentially (Bronze → Silver → Gold), verify results, and connect Power BI to the Gold layer unless the user explicitly requests a partial setup

PREFER

Incremental processing (watermark pattern) over full refresh
Separate notebooks per layer for independent testing and debugging
ZORDER on frequently filtered columns in Gold tables
Running OPTIMIZE after writes in Silver and Gold layers
Environment-specific Spark configs (write-heavy for Bronze, balanced for Silver, read-heavy for Gold)
OneLake shortcuts to expose Gold data to consumer workspaces without duplication
Clear layer ownership: engineers own Bronze/Silver, analysts own Gold
Fabric Variable Libraries to centralize paths and configuration across layers
Multi-workspace deployment patterns for medium/high governance requirements (Bronze/Silver/Gold in separate workspaces)
Use Materialized Lake Views (MLVs) for Silver/Gold tables when the transformation is expressible in Spark SQL and benefits from declarative refresh semantics. See spark-authoring-cli — Materialized Lake View patterns and MLV incremental refresh patterns.
For MLV refresh scheduling and monitoring (create/delete schedules, trigger on-demand refresh, check job status), use mlv-operations-cli. Note: "materialized view", "spark materialized view", and "MLV" all refer to the same Fabric feature.

AVOID

Storing all layers in a single lakehouse WITHOUT schemas — non-schema lakehouses require notebook init cells or Environment configuration to enable OneLake Spark Catalog for RLS/CLS and MLVs. Use separate lakehouses for isolation if schemas aren't available.
Creating 3 separate lakehouses when schema-enabled lakehouse is available — use schemas within one lakehouse instead (cleaner, no boilerplate init cells, more efficient for MLV cross-schema transformations)
Skipping the Silver layer and going directly from Bronze to Gold
Hardcoded workspace IDs, lakehouse IDs, or FQDNs — discover via REST API
SELECT * without LIMIT on Bronze tables (they grow unboundedly)
Running VACUUM without checking downstream dependencies
Chaining OneLake shortcuts between medallion layers (Bronze→Silver→Gold) — each layer must be physically materialized for lineage and governance
Copying complete implementation code into skills — guide the LLM to generate instead
Reading from external HTTP/HTTPS URLs directly in Spark — Fabric Spark cannot access arbitrary external URLs; land data in lakehouse Files/ first (via curl, OneLake API, or Fabric pipeline Copy activity), then read from the lakehouse path
Creating notebooks via REST API without validating `.ipynb` structure — missing execution_count: null or outputs: [] on code cells causes silent failures or "Job instance failed without detail error"

---

Workspace Setup Guidance

When setting up a medallion workspace, choose your architecture pattern first (see infrastructure-orchestration.md for detailed guidance):

Option A: Schema-Enabled Lakehouse (Preferred)

1. Create single workspace: {project}-{env} 2. Create one lakehouse with schemas: {project}_lakehouse 3. Create schemas within the lakehouse:

bronze schema for raw ingestion
silver schema for cleaned/validated data
gold schema for aggregated analytics

4. Choose transformation approach:

Option 4a: Use notebooks for each layer (PySpark or Spark SQL transformations)
Option 4b: Use Materialized Lake Views (Spark SQL) for declarative transformations with incremental refresh (when query is IR-eligible) — see materialized-lake-view-patterns.md and mlv-incremental-refresh-patterns.md
Note: PySpark MLVs exist but use full refresh only (no incremental) — use when you need UDFs/complex Python logic
MLV benefit: OneLake Spark Catalog is automatically enabled for schema-enabled lakehouses — MLVs work out-of-box with no notebook init cells or Environment configuration required

5. RBAC (optional): Use row-level security and column masking within schemas for fine-grained access control (also requires OneLake Spark Catalog)

Option B: Separate Lakehouses (Legacy)

1. Create three workspaces:

{project}-bronze-{env}
{project}-silver-{env}
{project}-gold-{env}

2. Create one lakehouse per workspace:

Bronze workspace → {project}_bronze lakehouse
Silver workspace → {project}_silver lakehouse
Gold workspace → {project}_gold lakehouse

3. Assign RBAC per layer workspace:

Bronze: ingestion/engineering write permissions
Silver: engineering/data quality permissions
Gold: analytics/BI consumer access with stricter curation controls

4. Enable OneLake Spark Catalog for non-schema lakehouses (required for RLS/CLS and catalog-backed access patterns):

Primary: Set spark.sql.fabric.catalog.enable-schemaless-lakehouses=true in an Environment and attach it to notebooks.
Alternative: Omit default lakehouse binding from notebooks. Use four-part fully-qualified references (workspace.lakehouse.schema.table). OneLake Spark Catalog auto-enables when no default lakehouse is set.
Alternative (internal/unsupported): Add this as the first cell in every notebook:

   %%pyspark
   !echo "spark.sql.fabric.catalog.enable-schemaless-lakehouses=true" >> /home/trusted-service-user/.trident-context

⚠️ Note: This workaround uses an internal runtime configuration path that may change in future Fabric releases. Prefer schema-enabled lakehouses for stable, documented OneLake Spark Catalog support.
With this configuration, non-schema lakehouses support:
✅ Row-level security (RLS) and column-level security (CLS)
Note: MLVs require schema-enabled lakehouses (Option A). For non-schema lakehouses, use notebooks with Delta tables.

Common Steps (Both Options)

After completing Option A or Option B above, perform these steps:

1. Create notebooks for each layer (one per transformation stage) — follow .ipynb validation + Fabric nuances 2. Bind each notebook to its lakehouse — set metadata.dependencies.lakehouse with the correct lakehouse ID (see notebook-api-operations.md § Default Lakehouse Binding):

Option A: All notebooks → same lakehouse, use schema prefixes (bronze.table, silver.table)
Option B:
Bronze notebook → Bronze workspace/lakehouse
Silver notebook → Silver workspace/lakehouse (reads Bronze via cross-workspace OneLake access / fully qualified references)
Gold notebook → Gold workspace/lakehouse (reads Silver via cross-workspace access)

3. Confirm notebook deployment — check that updateDefinition returned Succeeded; this is sufficient confirmation that content and lakehouse binding persisted. Do NOT call getDefinition to re-verify — it is an async LRO and adds unnecessary latency. 4. Execute notebooks sequentially — Bronze first, then Silver, then Gold — using POST .../jobs/instances?jobType=RunNotebook with the correct defaultLakehouse in execution config (both id and name required) 5. Connect Power BI to Gold layer — discover the Gold lakehouse SQL endpoint, create a Direct Lake semantic model, create a report with visuals on the Gold summary table (see Gold Layer → Power BI Consumption) 6. Create pipeline to orchestrate the Bronze → Silver → Gold flow for recurring execution

Explicit Override: Single Workspace

If the user explicitly asks for a single workspace deployment (for example, POC/small team/monolithic pattern), keep the current approach:

One workspace with separate Bronze/Silver/Gold lakehouses
Preserve layer separation logically even when workspace is shared
Call out governance trade-offs versus multi-workspace design

Parameterize by environment: workspace name suffix (-dev, -prod), data volume (sample vs full), capacity SKU, and Bronze retention period.

---

Bronze Layer — Ingestion Patterns

When a user requests data ingestion into the Bronze layer, guide LLM to:

1. Land data in lakehouse first: External data must be staged into the lakehouse Files/ folder before Spark can read it — use one of:

Fabric Pipeline Copy activity (preferred for recurring loads) — connects to external sources (HTTP, FTP, databases, cloud storage) and writes to OneLake
OneLake API / `curl` — upload files via REST API using storage.azure.com token (see COMMON-CLI.md § OneLake Data Access)
OneLake Shortcut — for data already in Azure ADLS Gen2, S3, or another OneLake location
`notebookutils.fs` — copy from mounted storage paths within a notebook
⚠️ Fabric Spark cannot read from arbitrary HTTP/HTTPS URLs — spark.read.format("csv").load("https://...") will fail

2. Read from lakehouse path: Once data is in Files/, read using lakehouse-relative paths (e.g., spark.read.format("csv").load("Files/landing/daily/")) 3. Add metadata and write: Tracking columns (ingestion timestamp, source file, batch ID), Delta table with descriptive name, partition by ingestion date, append mode 4. Validate: Log row counts, validate schema structure, flag anomalies vs historical patterns

---

Silver Layer — Transformation Patterns

When a user requests Bronze-to-Silver transformation, guide LLM to:

Quality rules: Deduplicate on natural/composite key, filter invalid ranges, handle nulls (drop required, fill optional), validate logical constraints
Schema conformance: snake_case column names, standardized data types, derived columns (durations, percentages, categories)
Schema evolution: Use mergeSchema option when source schemas change; coordinate downstream updates to Gold tables and Power BI datasets
Write strategy: Partition by business date, partition-aware overwrite, run OPTIMIZE after write, log before/after metrics

---

Gold Layer — Aggregation Patterns

When a user requests Gold analytics tables, guide LLM to generate:

Common aggregates: Daily/weekly/monthly summaries, dimensional analysis (by location, category, type), trend breakdowns over time, demand patterns (hour-of-day, day-of-week)
Spark session config — set these properties in the Gold notebook before any write operations:

  spark.conf.set("spark.sql.parquet.vorder.default", "true")
  spark.conf.set("spark.databricks.delta.optimizeWrite.enabled", "true")
  spark.conf.set("spark.databricks.delta.optimizeWrite.binSize", "1g")

V-Order (vorder.default) — applies Fabric's columnar sort optimization to all Parquet files, dramatically improving Direct Lake and SQL endpoint read performance
Optimize Write (optimizeWrite.enabled) — coalesces small partitions into optimally-sized files (target ~1 GB per binSize), reducing file count and improving scan efficiency
Optimization: ZORDER on filter columns, run OPTIMIZE after writes, pre-aggregate metrics to avoid runtime computation

---

End-to-End Execution Flow

When setting up medallion architecture end-to-end, the LLM must not stop after creating notebooks and deploying code. The complete lifecycle is:

Create Resources → Deploy Content → Bind Lakehouses → Execute → Verify Results

Step-by-Step

1. Create layer workspaces and lakehouses (default) — one workspace and one lakehouse per layer (Bronze, Silver, Gold); capture workspace IDs and lakehouse IDs 2. Create notebooks — one per layer, with valid .ipynb structure (see notebook-api-operations.md) 3. Bind lakehouse to each notebook — include metadata.dependencies.lakehouse in the .ipynb payload with:

default_lakehouse: the target lakehouse GUID
default_lakehouse_name: the lakehouse display name
default_lakehouse_workspace_id: the workspace GUID

4. Deploy notebook content — updateDefinition with the Base64-encoded .ipynb payload (content + lakehouse binding together) 5. Confirm deployment — check that each updateDefinition LRO returned Succeeded; that is sufficient. Do NOT call getDefinition to re-verify — it is an async LRO and adds significant latency per notebook. 6. Execute notebooks sequentially — use POST .../jobs/instances?jobType=RunNotebook:

Pass defaultLakehouse with both id and name in executionData.configuration
Run Bronze first → poll until Completed → run Silver → poll → run Gold → poll
Check for recent jobs before submitting (prevent duplicates — see SPARK-AUTHORING-CORE.md)

7. Verify results — after each notebook completes, confirm expected tables exist and row counts are reasonable 8. Connect Power BI to Gold — create semantic model + report on Gold summary tables (see Gold Layer → Power BI Consumption)

Common Failure: Stopping After Notebook Creation

If the flow stops after deploying notebook code without binding or executing:

Notebooks will have no lakehouse context → spark.sql() and relative paths (Tables/, Files/) fail at runtime
The user sees no output or results — the architecture is set up but never tested
Always complete through step 7 unless the user explicitly asks to stop at a specific step

---

Gold Layer → Power BI Consumption

After Gold tables are populated, connect Power BI to surface the analytics. Build a semantic model on top of the Gold lakehouse, using DirectLake.

Step-by-Step

1. Discover the Gold lakehouse SQL endpoint — call GET /v1/workspaces/{workspaceId}/lakehouses/{goldLakehouseId} and extract properties.sqlEndpointProperties.connectionString and provisioningStatus; wait until status is Success 2. Verify Gold tables via SQL — connect to the SQL endpoint using sqlcmd (see COMMON-CLI.md § SQL / TDS Data-Plane Access) and confirm the target table exists:

   SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME = 'nyc_taxi_daily_summary'

3. Create a semantic model — use the semantic-model-authoring skill for semantic model creation and TMDL deployment. Create via POST /v1/workspaces/{workspaceId}/items with type: "SemanticModel" then deploy definition via updateDefinition using TMDL format (see ITEM-DEFINITIONS-CORE.md § SemanticModel):

The model must reference the Gold lakehouse SQL endpoint as its data source
Define a table mapping to the Gold summary table (e.g., nyc_taxi_daily_summary)
Use Direct Lake mode — this connects directly to Delta tables in OneLake without data import
Include measures for key aggregations you find interesting (e.g., Total Trips, Avg Fare, Total Revenue, Month over Month Growth)

4. Create a Power BI report — POST /v1/workspaces/{workspaceId}/items with type: "Report" then deploy definition via updateDefinition using PBIR format (see ITEM-DEFINITIONS-CORE.md § Report):

Reference the semantic model created in step 3 via definition.pbir
Define at least one page with visuals on the Gold summary table
Suggested visuals: line chart (daily trend), card (KPI totals), bar chart (by category), table (detail view)

5. Verify end-to-end — use the semantic-model-consumption skill to run DAX queries against the semantic model and confirm data flows from Gold tables through to the report

Principles

Discover SQL endpoint dynamically — the connection string is in properties.sqlEndpointProperties.connectionString on the lakehouse response; never hardcode it
Wait for SQL endpoint provisioning — status must be Success before connecting; newly created lakehouses may take minutes to provision
Prefer Direct Lake mode — avoids data duplication; semantic model reads directly from OneLake Delta tables
Match table/column names exactly — the semantic model table definition must use the exact Delta table and column names from the Gold lakehouse
For semantic model authoring (TMDL, refresh, permissions), cross-reference the semantic-model-authoring skill
For DAX query validation, cross-reference the semantic-model-consumption skill

---

Pipeline Orchestration

When a user requests a pipeline for the medallion flow, guide LLM to design with:

Structure: Sequential activities (Bronze → Silver → Gold), each waiting for previous success; independent Gold aggregations can run in parallel; include validation and notification activities
Parameterization: Pipeline-level processing date (defaults to yesterday), passed to all notebooks; dynamic date expressions
Scheduling: Daily aligned with source refresh, watermark-based incremental processing, periodic full refresh for corrections
Error handling: Retry with backoff for transient failures, alerting for persistent failures, graceful degradation (downstream uses previous data if upstream fails)

---

Environment Optimization

For detailed Spark configurations and optimization strategies, see [data-engineering-patterns.md](../spark-authoring-cli/resources/data-engineering-patterns.md).

Layer	Profile	Key Settings
Bronze	Write-heavy	Disable V-Order, enable autoCompact, large file targets, partition by ingestion_date
Silver	Balanced	Enable V-Order, adaptive query execution, partition by business date, ZORDER on filtered columns
Gold	Read-heavy	V-Order (`spark.sql.parquet.vorder.default=true`), Optimize Write (`optimizeWrite.enabled=true`, `binSize=1g`), vectorized readers, adaptive execution, ZORDER on all filter columns, pre-aggregate metrics

---

Examples

Example 1: Set Up Medallion Workspaces (Default)

Prompt: "Set up medallion architecture with separate Bronze, Silver, and Gold workspaces for sales analytics"

What the LLM should generate: REST API calls to: 1. Create workspaces: sales-bronze-dev, sales-silver-dev, sales-gold-dev 2. Create one lakehouse in each workspace: sales_bronze, sales_silver, sales_gold 3. Assign RBAC roles per workspace/layer

# Workspace creation (see COMMON-CLI.md for full patterns)
cat > /tmp/body.json << 'EOF'
{"displayName": "sales-analytics-dev"}
EOF
workspace_id=$(az rest --method post --resource "https://api.fabric.microsoft.com" \
  --url "https://api.fabric.microsoft.com/v1/workspaces" \
  --body @/tmp/body.json --query "id" --output tsv)

# Create Bronze lakehouse
cat > /tmp/body.json << 'EOF'
{"displayName": "sales_bronze", "type": "Lakehouse"}
EOF
az rest --method post --resource "https://api.fabric.microsoft.com" \
  --url "https://api.fabric.microsoft.com/v1/workspaces/$workspace_id/items" \
  --body @/tmp/body.json

Example 2: Design Bronze Ingestion

Prompt: "Ingest daily CSV files into bronze lakehouse with metadata columns"

What the LLM should generate: PySpark notebook that: 1. Reads source files with schema inference or explicit schema 2. Adds ingestion_timestamp, source_file, batch_id columns 3. Writes to Delta table partitioned by ingestion date 4. Logs row count and validation metrics

# Bronze ingestion pattern (guide LLM to generate full implementation)
from pyspark.sql.functions import current_timestamp, input_file_name, lit
import uuid

batch_id = str(uuid.uuid4())
df = (spark.read.format("csv").option("header", True).load("/Files/landing/daily/")
      .withColumn("ingestion_timestamp", current_timestamp())
      .withColumn("source_file", input_file_name())
      .withColumn("batch_id", lit(batch_id)))
df.write.mode("append").partitionBy("ingestion_date").format("delta").saveAsTable("bronze.events_raw")

Example 3: Bronze-to-Silver Transformation

Prompt: "Clean bronze data: remove duplicates, filter invalid records, add derived columns, write to silver"

What the LLM should generate: PySpark notebook applying quality rules, schema conformance, and partitioned write with optimization.

Example 4: End-to-End Pipeline

Prompt: "Create a pipeline that runs bronze ingestion, then silver transformation, then gold aggregation daily at 2 AM"

What the LLM should generate: Pipeline JSON definition with sequential notebook activities, date parameter, retry logic, and schedule trigger.

Related skills

Azure DeploySafely execute production deployments of already-prepared applications to Microsoft Azure.478k1.3k

Azure ValidateRun deep pre-deployment checks on Azure configuration, infrastructure definitions, RBAC roles, and managed identities before pushing to production.477k1.3k

Github Actions DocsGet precise, docs-grounded answers about GitHub Actions workflows, syntax, security, and migration instead of relying on stale knowledge.275k72

Setup Pre CommitAutomatically run Prettier, type checking, and tests on every commit via Husky and lint-staged.161k188k

Deploy To VercelSafely turn any local project into a live Vercel preview with one instruction.97.8k29.5k

Vercel Cli With TokensDeploy projects to Vercel from agents and scripts using token authentication instead of interactive browser login.73.4k29.5k

FAQ

What does e2e-medallion-architecture do?

When should I use e2e-medallion-architecture?

Invoke when >.

Is e2e-medallion-architecture safe to install?

Review the Security Audits panel on this page before installing in production.

DevOps & CI/CDdeploy

About

E2e Medallion Architecture by the numbers

e2e-medallion-architecture capabilities & compatibility

What e2e-medallion-architecture says it does

Add your badge

How do I apply e2e-medallion-architecture using the workflow in its SKILL.md?

Who is it for?

When should I use this skill?

What you get

Files

End-to-End Medallion Architecture

Prerequisite Knowledge

Architecture Overview

Must/Prefer/Avoid

MUST DO

PREFER

AVOID

Workspace Setup Guidance

Option A: Schema-Enabled Lakehouse (Preferred)

Option B: Separate Lakehouses (Legacy)

Common Steps (Both Options)

Explicit Override: Single Workspace

Bronze Layer — Ingestion Patterns

Silver Layer — Transformation Patterns

Gold Layer — Aggregation Patterns

End-to-End Execution Flow

Step-by-Step

Common Failure: Stopping After Notebook Creation

Gold Layer → Power BI Consumption

Step-by-Step

Principles

Pipeline Orchestration

Environment Optimization

Examples

Example 1: Set Up Medallion Workspaces (Default)

Example 2: Design Bronze Ingestion

Example 3: Bronze-to-Silver Transformation

Example 4: End-to-End Pipeline

Related skills

FAQ

What does e2e-medallion-architecture do?

When should I use e2e-medallion-architecture?

Is e2e-medallion-architecture safe to install?

This week in AI coding