Medallion architecture

Medallion architecture organises a lakehouse into three layers: bronze for raw data, silver for cleaned data, and gold for business-ready tables. As a result, everyone knows what state the data is in and which layer reports are allowed to build on.

What is medallion architecture?

Medallion architecture is a way to organise the data in a lakehouse into three clear layers: bronze, silver, and gold. Each layer has a different quality bar, a different audience, and a different purpose. Data flows from layer to layer and at every step becomes cleaner, more structured, and closer to what the business actually needs.

The term was popularised by Databricks around 2020 and has since become a common default pattern in lakehouse implementations, including those on Microsoft Fabric. It is not new technology. Classic data warehouses have been using staging, cleansed, and presentation layers for years. Medallion is the same pattern, renamed for the realities of a lake.

Think of the three layers as three stations in a kitchen. Bronze is the back door delivery: raw ingredients as they just came in. Silver is the prep: washed, chopped, ready to cook. Gold is the plate going out to the dining room: presentable, portioned, and finished.

What sits in each layer?

Bronze layer (raw)
A copy of the source data, as close to the original as possible. You store it as it arrives, with minimal transformation: at most a few metadata columns such as a load timestamp, a source identifier, and a load_date column. Bronze is your audit trail and your restart point. If something goes wrong in silver or gold, you can always replay from here.
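
As a minimal sketch of that idea in plain Python (a real lakehouse would typically do this in Spark or a pipeline): the function name to_bronze and the column names _source and _loaded_at are assumptions made for illustration, only load_date comes from the description above. The payload is deliberately left untouched, including duplicates and stray whitespace.

```python
from datetime import datetime, timezone


def to_bronze(raw_rows, source, loaded_at=None):
    """Attach load metadata to raw rows without changing the payload."""
    loaded_at = loaded_at or datetime.now(timezone.utc)
    return [
        {
            **row,  # payload stored exactly as received
            "_source": source,  # hypothetical source identifier column
            "_loaded_at": loaded_at.isoformat(),  # hypothetical load timestamp column
            "load_date": loaded_at.date().isoformat(),
        }
        for row in raw_rows
    ]


# Illustrative data: a duplicate row and untrimmed text are kept as-is.
raw = [
    {"order_id": "17", "amount": " 12.50"},
    {"order_id": "17", "amount": " 12.50"},
]
bronze = to_bronze(
    raw,
    source="webshop",
    loaded_at=datetime(2024, 1, 5, 8, 30, tzinfo=timezone.utc),
)
```

Nothing is cleaned here on purpose: fixing the duplicate or the whitespace is silver's job, and bronze stays useful as a replay point only if it mirrors what the source delivered.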

Silver layer (cleansed, conformed)
Here the data gets cleaned, standardised, and combined. Duplicates removed, data types normalised, invalid records isolated, keys matched across sources. Silver is the version data engineers and data scientists work with: reliable, but not yet curated per business question.
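
The cleansing steps can be sketched in plain Python as follows. This is an illustration under assumptions, not a reference implementation: the function to_silver, the order_id and amount fields, and the choice to keep the first occurrence of a duplicate are all hypothetical.

```python
from decimal import Decimal, InvalidOperation


def to_silver(bronze_rows):
    """Return (clean, rejected): typed, de-duplicated rows and quarantined rows."""
    clean, rejected, seen = [], [], set()
    for row in bronze_rows:
        try:
            order_id = int(row["order_id"])
            amount = Decimal(row["amount"].strip())
            if not amount.is_finite():
                raise ValueError("amount must be a finite number")
        except (KeyError, ValueError, AttributeError, InvalidOperation):
            rejected.append(row)  # isolate invalid records instead of dropping them
            continue
        if order_id in seen:
            continue  # duplicate key: keep the first occurrence only
        seen.add(order_id)
        clean.append({"order_id": order_id, "amount": amount})
    return clean, rejected


# Illustrative bronze rows: one duplicate and one unparseable amount.
bronze_rows = [
    {"order_id": "17", "amount": " 12.50"},
    {"order_id": "17", "amount": " 12.50"},
    {"order_id": "18", "amount": "n/a"},
]
clean, rejected = to_silver(bronze_rows)
```

Returning the rejected rows separately keeps invalid records available for inspection, which is what "isolated" means in practice.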

Gold layer (business-ready)
Aggregations, dimensional models, and use-case-specific datasets. This is where the tables that feed your reports and dashboards live. A gold layer can hold several variants: a finance gold, a sales gold, an operations gold, each with its own semantics and grain.
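
A gold table is often just silver rolled up to the grain a report needs. The sketch below assumes a hypothetical sales gold table with one row per order date and region; the function daily_revenue and all field names are made up for the example.

```python
from collections import defaultdict
from decimal import Decimal


def daily_revenue(silver_orders):
    """Aggregate silver orders to one row per (order_date, region)."""
    totals = defaultdict(Decimal)  # Decimal() starts each total at 0
    for order in silver_orders:
        totals[(order["order_date"], order["region"])] += order["amount"]
    return [
        {"order_date": order_date, "region": region, "revenue": revenue}
        for (order_date, region), revenue in sorted(totals.items())
    ]


# Illustrative silver rows, already typed and de-duplicated.
silver_orders = [
    {"order_date": "2024-01-05", "region": "EU", "amount": Decimal("12.50")},
    {"order_date": "2024-01-05", "region": "EU", "amount": Decimal("7.50")},
    {"order_date": "2024-01-05", "region": "US", "amount": Decimal("3.00")},
]
gold_sales = daily_revenue(silver_orders)
```

A finance gold could aggregate the same silver rows by month and cost centre instead, which is why several gold variants can coexist on top of one silver layer.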

Why three layers?

Clarity about quality
Reports rely on gold, experiments on silver, recovery on bronze. Everyone in the organisation knows which data they have in hand.

Repeatability
If a transformation turns out to be wrong, you can restart from the previous layer without having to fetch the source data again.

Separation of concerns
Bronze is a data engineering problem. Silver is a data engineering and data quality problem. Gold is often a collaboration with business analysts. The layers create natural handoff points.

Cost control
You can keep data that still changes on hot storage and move stable data to colder tiers. Bronze, for example, sometimes lives on colder storage while gold stays on the fastest tier.

Implementing medallion in Fabric

In Microsoft Fabric, a medallion architecture emerges quite naturally:

  1. Bronze in a lakehouse. Data Factory pipelines land raw files or tables in a bronze lakehouse. Tables are stored in Delta format, that is, Parquet data files plus a transaction log.

  2. Silver through notebooks or dataflows. Spark notebooks or Dataflows Gen2 read bronze, apply cleansing rules, and write to a silver lakehouse.

  3. Gold as a lakehouse or warehouse. For BI, a Fabric Data Warehouse on top of silver is often handy because it offers a full SQL engine and supports row-level security (RLS) natively.

  4. Power BI on gold. The Power BI semantic model rests only on the gold layer, never directly on bronze or silver.

Every layer lives on the same OneLake storage, so data does not have to be copied between separate storage systems. The separation is in workspaces, schemas, and folders, not in different physical systems.
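
To illustrate that the layers differ only in their path, the sketch below builds OneLake table URIs for three lakehouses in one workspace. It assumes the documented abfss://<workspace>@onelake.dfs.fabric.microsoft.com/<item>.<itemtype>/<path> addressing scheme. The workspace name, the lakehouse names bronze, silver, and gold, and the helper onelake_table_path are all hypothetical, and a setup where gold is a warehouse would look different.

```python
LAYERS = ("bronze", "silver", "gold")

ONELAKE_HOST = "onelake.dfs.fabric.microsoft.com"


def onelake_table_path(workspace, layer, table):
    """Build the URI of a table in a lakehouse named after its layer (assumed naming)."""
    if layer not in LAYERS:
        raise ValueError(f"unknown layer: {layer!r}")
    return f"abfss://{workspace}@{ONELAKE_HOST}/{layer}.Lakehouse/Tables/{table}"


# Hypothetical workspace "analytics": same storage, different folders per layer.
paths = {layer: onelake_table_path("analytics", layer, "orders") for layer in LAYERS}
```

All three URIs point at the same storage account endpoint, so moving between layers is a matter of reading one path and writing another.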

Pitfalls

Layers for the sake of layers
Three is a guideline, not a commandment. For some use cases bronze plus gold is enough, for others you add a platinum layer for heavily aggregated KPIs or an extra silver step for enrichment. Let the architecture serve the use case, not the other way around.

Gold as a dumping ground
Every team dropping its own tables into gold without governance turns it into a new data swamp within a year. Keep gold curated and documented, preferably with data lineage in place.

Weak bronze discipline
A bronze layer that gets overwritten occasionally is not an audit trail. Append every batch with a load timestamp and let Delta keep the version history, rather than overwriting in place. Otherwise you lose the restartability advantage.
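
The discipline can be sketched with an in-memory stand-in for a bronze table. The class BronzeLog and the _batch_id and _loaded_at columns are hypothetical; in a real lakehouse the same effect would come from appending to a Delta table rather than from custom code.

```python
from datetime import datetime, timezone


class BronzeLog:
    """In-memory stand-in for an append-only bronze table (illustration only)."""

    def __init__(self):
        self._batches = {}  # batch_id -> (loaded_at, rows)

    def append(self, batch_id, rows, loaded_at=None):
        """Store a new batch; refuse to overwrite one that was already loaded."""
        if batch_id in self._batches:
            raise ValueError(f"batch {batch_id!r} already loaded; bronze is append-only")
        loaded_at = loaded_at or datetime.now(timezone.utc)
        self._batches[batch_id] = (loaded_at, [dict(row) for row in rows])

    def replay(self):
        """Yield every stored row in load order, tagged with its batch metadata."""
        ordered = sorted(self._batches.items(), key=lambda item: item[1][0])
        for batch_id, (loaded_at, rows) in ordered:
            for row in rows:
                yield {**row, "_batch_id": batch_id, "_loaded_at": loaded_at.isoformat()}


log = BronzeLog()
log.append("b2", [{"id": 2}], loaded_at=datetime(2024, 1, 6, tzinfo=timezone.utc))
log.append("b1", [{"id": 1}, {"id": 3}], loaded_at=datetime(2024, 1, 5, tzinfo=timezone.utc))
replayed = list(log.replay())  # rows from b1 first, then b2
```

Because nothing is ever replaced, silver can be rebuilt at any time by running the cleansing step over replay() again.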

Forgetting access separation
Bronze holds uncleaned data, and both bronze and silver can contain personal data that is not meant for everyone. Set different permissions on bronze and silver than on gold, and document why.
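
One lightweight way to document the intended separation is a small access matrix that sits next to the platform's own permission settings. The roles and the can_read helper below are hypothetical and do not enforce anything by themselves.

```python
# Hypothetical roles mapped to the layers each one may read.
READ_ACCESS = {
    "data_engineer": {"bronze", "silver", "gold"},
    "data_scientist": {"silver", "gold"},
    "report_consumer": {"gold"},
}


def can_read(role, layer):
    """Return True if the documented policy lets this role read the layer."""
    return layer in READ_ACCESS.get(role, set())


allowed = can_read("report_consumer", "gold")
denied = can_read("report_consumer", "bronze")
```

Unknown roles get no access by default, which mirrors the safer choice of granting layer access explicitly.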

Last Updated: April 23, 2026
Keywords: medallion architecture, bronze, silver, gold, lakehouse, Microsoft Fabric, OneLake, Databricks, Delta Lake, data engineering, ETL, data warehouse