Data lineage

Data lineage shows the full journey data takes inside an organisation, from the original source to the final report, with meaning and context attached. It is what lets people trust the numbers they see on a dashboard.

What is data lineage?

Data lineage is the map of the journey data takes inside an organisation. You can see which source it came from, the steps it went through along the way, and where it finally landed. That journey can run from an operational system all the way to a dashboard or a report.

Think of it as a route description. Departure and arrival both matter, but so do all the stops in between. Without that overview it gets very hard to understand why the numbers say what they say.

In most organisations, data grows organically. First there is an accounting package, then a CRM, later a pile of Excel files, scripts and dashboards on top. Every new piece adds complexity, and at some point nobody is sure which source is the right one or why a number suddenly shifted last week. That is where data lineage comes in.

Why is data lineage important?

Data lineage builds trust in data because users can see where the numbers come from. Discussions get sharper, and decisions land on firmer ground.

On top of that, data lineage matters for:

  • Troubleshooting and root-cause analysis

  • Impact analysis when something is about to change

  • Audits and regulatory reporting

  • Knowledge sharing inside teams

Without lineage, knowledge about your data lives only in the heads of the people who built it. With lineage, that knowledge belongs to the team.

How does data lineage work?

Data lineage follows your data through the layers it passes:

  • Source systems like ERP, CRM or external files

  • Processing through ETL or ELT pipelines

  • Storage in a data warehouse or database

  • Consumption in reports, dashboards and analyses

That flow can be captured at different levels of detail, from a high-level diagram down to individual columns. The right level of detail depends on the goal and the audience you are documenting it for.
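
As a minimal sketch of that flow, the layers can be modelled as a small directed graph and walked backwards from a report to its sources. Every dataset name below (erp.invoices, staging.customers, dashboard.revenue and so on) is a hypothetical example, and the dataset-level granularity is just one of the possible levels of detail.

```python
# Hypothetical lineage graph: each key is a dataset, each value lists
# the datasets it is built from (its direct upstream dependencies).
UPSTREAM = {
    "dashboard.revenue": ["warehouse.fact_sales"],        # consumption
    "warehouse.fact_sales": ["staging.invoices", "staging.customers"],  # storage
    "staging.invoices": ["erp.invoices"],                 # processing
    "staging.customers": ["crm.customers"],               # processing
    "erp.invoices": [],                                   # source system
    "crm.customers": [],                                  # source system
}


def trace_upstream(dataset, graph):
    """Return every dataset that feeds `dataset`, directly or indirectly."""
    seen = set()
    stack = list(graph.get(dataset, []))
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(graph.get(current, []))
    return sorted(seen)


sources = trace_upstream("dashboard.revenue", UPSTREAM)
print(sources)
```

The same structure works at column level; only the node names get longer.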

Technical data lineage

What is it?

Technical data lineage describes how data physically flows through systems. It focuses on tables, columns, views and code. The question it answers is: how is this data moved and transformed?

What gets captured?

  • Source tables and fields

  • Transformations in SQL or ETL tools

  • Relationships between layers in the data warehouse

  • Dependencies between datasets and reports
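
One way to picture the items in that list is a small record per column: the target field, the source fields it is derived from, and the transformation in between. This is an illustrative sketch only. The class name `ColumnLineage`, the table and column names, and the SQL expressions are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnLineage:
    target: str          # column being produced, written as "table.column"
    sources: tuple       # columns it is derived from
    transformation: str  # SQL expression or ETL step, kept as plain text


# Hypothetical edges between a source, a staging layer and a warehouse layer.
EDGES = [
    ColumnLineage(
        target="warehouse.fact_sales.net_amount",
        sources=("staging.invoices.gross_amount", "staging.invoices.vat_amount"),
        transformation="gross_amount - vat_amount",
    ),
    ColumnLineage(
        target="staging.invoices.gross_amount",
        sources=("erp.invoices.amount",),
        transformation="CAST(amount AS DECIMAL(12, 2))",
    ),
]


def direct_sources(column, edges):
    """Return the columns that `column` is computed from, one step back."""
    for edge in edges:
        if edge.target == column:
            return list(edge.sources)
    return []


print(direct_sources("warehouse.fact_sales.net_amount", EDGES))
```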

Who is it for?

Technical lineage is mainly useful for data engineers and BI developers. They lean on it during troubleshooting, impact analysis and day-to-day maintenance.
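
Impact analysis is the same idea read in the other direction: start at the object that is about to change and list everything that depends on it. The sketch below assumes a hypothetical dependency map with made-up dataset names; a real tool would load this from its own metadata.

```python
from collections import deque

# Hypothetical dependencies: each key feeds the datasets listed as its value.
DOWNSTREAM = {
    "erp.invoices": ["staging.invoices"],
    "staging.invoices": ["warehouse.fact_sales"],
    "warehouse.fact_sales": ["dashboard.revenue", "report.vat_return"],
    "dashboard.revenue": [],
    "report.vat_return": [],
}


def impacted_by(dataset, graph):
    """Breadth-first walk: everything that could be affected if `dataset` changes."""
    impacted = []
    queue = deque(graph.get(dataset, []))
    while queue:
        current = queue.popleft()
        if current in impacted:
            continue
        impacted.append(current)
        queue.extend(graph.get(current, []))
    return impacted


print(impacted_by("staging.invoices", DOWNSTREAM))
```

Breadth-first order lists the nearest dependants first, which is usually the order in which a developer wants to check them.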

How is it kept up?

Technical lineage is often built up automatically by tooling that scans code and metadata. That works well for straightforward pipelines but still needs human review, because a scanner cannot always parse complex transformations correctly.
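
To illustrate why review remains necessary, here is a deliberately naive scanner. It is a toy built on a regular expression, not how any particular lineage product works, and the SQL statements and table names are invented for the example. It finds tables in a plain query but sees nothing when the table name is assembled at run time.

```python
import re

# Matches the identifier that follows FROM or JOIN. Deliberately simplistic.
TABLE_PATTERN = re.compile(r"\b(?:FROM|JOIN)\s+([A-Za-z_][\w.]*)", re.IGNORECASE)


def scan_tables(sql):
    """Return the table names a naive scanner can find in a SQL string."""
    return sorted({name.lower() for name in TABLE_PATTERN.findall(sql)})


# Hypothetical static query: both tables are written out in the text.
static_sql = """
SELECT c.name, SUM(i.amount)
FROM staging.invoices AS i
JOIN staging.customers AS c ON c.id = i.customer_id
GROUP BY c.name
"""

# Hypothetical dynamic query: the table name only exists at run time.
dynamic_sql = "EXECUTE 'SELECT * FROM ' || table_name"

found_static = scan_tables(static_sql)
found_dynamic = scan_tables(dynamic_sql)
print(found_static)
print(found_dynamic)
```

The second result is empty even though the statement clearly reads from a table, which is exactly the kind of gap a human reviewer has to close.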

Functional data lineage

What is it?

Functional data lineage describes data from a business angle. It focuses on meaning, definitions and use. The question it answers is: what does this number actually represent?

What gets captured?

  • Definitions of KPIs and metrics

  • Business rules and filters

  • Exceptions and agreed conventions

  • How the numbers feed into decisions
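
Even when functional lineage is written by hand, a fixed structure helps keep entries comparable. The record below is a sketch of what one entry could hold. The field names mirror the list above, while the revenue definition, rules and exception are invented placeholders rather than a recommended definition.

```python
from dataclasses import dataclass, field


@dataclass
class MetricDefinition:
    name: str
    definition: str
    business_rules: list = field(default_factory=list)
    exceptions: list = field(default_factory=list)
    used_for: str = ""  # the decision or meeting this number feeds into


# Hypothetical example entry.
revenue = MetricDefinition(
    name="Revenue",
    definition="Invoiced amount excluding VAT",
    business_rules=["Credit notes are subtracted", "Only posted invoices count"],
    exceptions=["Intercompany invoices are excluded"],
    used_for="Quarterly management review",
)


def describe(metric):
    """Render a metric as plain text for a wiki page or catalog entry."""
    lines = [f"{metric.name}: {metric.definition}"]
    lines += [f"  rule: {rule}" for rule in metric.business_rules]
    lines += [f"  exception: {item}" for item in metric.exceptions]
    return "\n".join(lines)


print(describe(revenue))
```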

Who is it for?

Functional lineage is aimed at business users, management and data stewards. It raises understanding and pushes everyone to use the same numbers in the same way.

How is it kept up?

Functional lineage is usually captured by hand, through documentation, data catalogs and conversations with the people who own the metric. Automation only goes so far. Alignment between teams is what really keeps it accurate.

The difference and how they fit together

Technical and functional lineage complement each other. One shows how data flows; the other shows what the data means. Without the technical layer you lack control. Without the functional layer you lack context.

A good approach connects both. A business definition links straight back to a technical source, and one coherent story emerges from the two.
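
A simple way to test that link is to check that every business definition points at a column the technical lineage actually knows about. The sketch below assumes two hypothetical inputs: a set of known columns and a mapping from KPI name to its technical source. All names are made up for illustration.

```python
# Hypothetical set of columns known to the technical lineage.
TECHNICAL_COLUMNS = {
    "warehouse.fact_sales.net_amount",
    "warehouse.fact_sales.invoice_date",
}

# Hypothetical business definitions, each pointing at one technical source.
BUSINESS_DEFINITIONS = {
    "Revenue": "warehouse.fact_sales.net_amount",
    "Active customers": "warehouse.dim_customer.is_active",
}


def unlinked_definitions(definitions, columns):
    """Return the KPI names whose technical source is not in the lineage."""
    return sorted(
        name for name, column in definitions.items() if column not in columns
    )


missing = unlinked_definitions(BUSINESS_DEFINITIONS, TECHNICAL_COLUMNS)
print(missing)
```

A non-empty result means a definition has drifted away from the data it claims to describe, or the technical lineage is incomplete.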

Tools and applications for data lineage

Several types of tools live in this space:

Technical lineage tools

These focus on automatic detection of data flows by scanning code and metadata. They are strong on detail, weaker on meaning.

Data catalogs with lineage

These combine metadata, business definitions and lineage in one place. They are friendlier for business users and they support governance work directly.

Open-source options

Flexible and budget-friendly, but they ask more from your team in technical knowledge and ongoing maintenance.

Manual approaches

Diagrams, wikis and shared documents still pull their weight, especially for smaller setups or as a starting point before you invest in tooling.

In practice, a mix of the above is usually the most realistic path.

Best practices for data lineage

  • Start from a clear purpose

  • Begin small and focus on the data that really matters

  • Combine technical and functional lineage

  • Document at the right level of detail

  • Use consistent terminology across the board

  • Automate where it actually helps

  • Keep everything current

In practice, simplicity and discipline matter more than completeness.

Keeping data lineage alive in your organisation

Make it part of the work

Wire data lineage into existing processes. Update it whenever a new report goes live or an existing source, transformation or definition changes. That way it becomes routine rather than a one-off project.

Work with owners

Every dataset and every KPI needs a clear owner. Without ownership, lineage goes stale fast.
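
Ownership is easy to check mechanically once it is written down somewhere. The snippet below is a sketch under the assumption that owners are kept in a simple mapping; the dataset and team names are hypothetical.

```python
# Hypothetical ownership register: None or an empty string means no owner yet.
OWNERS = {
    "dashboard.revenue": "finance-team",
    "warehouse.fact_sales": "data-engineering",
    "staging.invoices": None,
    "kpi.active_customers": "",
}


def without_owner(owners):
    """Return the datasets and KPIs that have no owner assigned."""
    return sorted(name for name, owner in owners.items() if not owner)


orphans = without_owner(OWNERS)
print(orphans)
```

Running a check like this on a schedule, or whenever a new report goes live, is one way to catch gaps before the lineage goes stale.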

Limit the scope

You do not need to document everything. Focus on the data that drives decisions or that gets shared with people outside the team.

Combine tooling with conversation

Tools show you the structure. Conversations create understanding. You need both.

Make it visible and useful

Use lineage actively when questions or change requests come in. The parts that get used are the parts that survive.

Worked example

A Belgian SME runs a revenue dashboard. The definition of revenue has shifted over time. Without lineage, the discussions go in circles every quarter.

With data lineage in place it is clear:

  • Which source the number is built on

  • Which transformations it has been through

  • What revenue actually means in this context

Changes happen under control, and trust in the dashboard grows.

Our take

Data lineage does not have to be perfect. It has to work. Better simple and supported by the team than complex and forgotten in a folder. Data lineage is not a document, it is a habit.

Last Updated: April 18, 2026
Keywords
data lineage, data governance, metadata, data flow, ETL, ELT, data documentation, data catalog, impact analysis, data quality