Dictionary

ETL / ELT

ETL and ELT stand for Extract, Transform, Load and Extract, Load, Transform. They are two ways of moving data from source systems into a central analytics environment. The order of transformation and loading decides which pattern you use and which tools fit.

What are ETL and ELT?

ETL and ELT are two patterns for moving data from source systems to a central analytics environment. Both consist of the same three steps, but in a different order.

ETL stands for Extract, Transform, Load. You extract data from the sources, transform it on a staging server, and load the end result into the data warehouse. Transformation happens before the data lands in the warehouse.

ELT stands for Extract, Load, Transform. You extract data from the sources, load it into the warehouse or lakehouse as-is, and transform it inside the target environment. Transformation happens after the data has landed.

It looks like a detail, but the difference shapes the tools you choose, performance, governance, and how quickly a team can ship new reports.

How is ELT different from ETL?

ETL is the older of the two. When warehouses ran on expensive on-premises servers, loading all raw data into them was unaffordable. A separate ETL server handled transformations so the warehouse only ever saw clean, aggregated data.

ELT became popular when cloud warehouses like Snowflake, BigQuery, and Microsoft Fabric made storage and compute cheap enough. Suddenly you could land terabytes of raw data without worrying and decide what to do with it later. That brings a few big advantages:

  • Reprocessability
    When logic changes, you rewrite the transformation. You don't have to re-extract from the source.

  • Single source of truth
    Raw and transformed data live in the same place, under the same security, queryable with the same tools.

  • Flexibility for data science
    ML teams often need access to raw data. With ETL it's gone. With ELT it's ready to use.

  • SQL as the transformation language
    In ELT you transform inside the database or lakehouse. Tools like dbt make that highly productive, and SQL skills are much more widely available than ETL-specific languages.

ELT also has downsides. You sometimes load data you never use, which costs money. And if governance is weak, analysts each build their own transformations, each with a slightly different customer view.

When do you use ETL?

  1. Sensitive data that must not persist in raw form
    Medical, legal, or personal data that needs masking or pseudonymisation before landing in an analytics environment.

  2. Limited target environment
    When the warehouse isn't powerful enough to handle heavy transformations (typical for older on-premises setups).

  3. Data minimisation compliance
    GDPR or sector rules may require that you only store strictly necessary fields in your analytics environment.

  4. Legacy systems and mainframes
    When the source is a mainframe with slow exports, you sometimes want to transform during extraction to cut total turnaround time.

When do you use ELT?

  1. Modern cloud platforms
    Anyone building a new data platform on Microsoft Fabric, Snowflake, BigQuery, or Databricks today starts from ELT almost by default.

  2. Data science and AI
    Models need raw historical data to learn patterns. ELT keeps it available without a second pipeline.

  3. Fast iteration
    Where business definitions change frequently, editing a SQL transformation is faster than rewriting an ETL job.

  4. Self-service analytics
    Analysts want to build their own gold tables on top of a silver layer. With ELT and a tool like dbt that pattern is built in.

Tools for ETL and ELT

The line between ETL and ELT tools is blurring. Many modern tools support both patterns.

Microsoft Fabric Data Factory
Visual pipelines and dataflows for both ingestion and transformations. Hundreds of connectors to cloud and on-premises sources. A strong option in a Microsoft ecosystem.

Azure Data Factory
The standalone version outside Fabric, still common in existing Azure projects. Integrates with SSIS for migration scenarios from on-premises SQL Server.

dbt (data build tool)
The most popular ELT tool for transformations inside the warehouse or lakehouse. You write models in SQL with version control, tests, and documentation. Runs on Snowflake, BigQuery, Databricks, Fabric, and more.

Power Query
Inside Power BI and Excel, and also available in Fabric Dataflows. Ideal for business users who want to clean and transform their own data without code.

Fivetran, Airbyte, Matillion
Specialists in extraction and loading. Often paired with dbt for the transformation layer.

Apache Airflow, Dagster, Prefect
Orchestration tools that schedule, monitor, and retry ETL or ELT pipelines. Code-first, ideal for complex dependencies.

SSIS (SQL Server Integration Services)
Classic ETL tool in Microsoft environments. Still common in existing landscapes, rarely the first choice for new projects.

Common pitfalls

Schema drift
Source systems quietly change their schemas (new fields, renames, type changes). Without detection, pipelines only break on the next run. Automatic schema detection and alerting belong in every pipeline.

Idempotency
A pipeline must be safe to rerun without producing duplicates or half results. That takes discipline around staging, upserts, and logging.

Backfill
When logic changes, you often need to reprocess history. Build pipelines with backfill in mind from day one or you will be surprised later.

No data lineage
Without data lineage, nobody knows which source feeds which column in which report. Every change becomes a spelunking expedition. Modern tools like dbt and Fabric generate lineage automatically.

Manual exports that linger
CSV dumps by email, Excel files on a shared drive. They outlive their intended lifespan by years and become the silent centre of fragile reporting. Remove them as soon as the official pipeline is in place.

Last Updated: April 18, 2026 Back to Dictionary
Keywords
etl elt data integration data factory dbt power query microsoft fabric data warehouse lakehouse data pipeline