Process discovery

Process discovery is the step in process mining that automatically draws a process map from your event log. No workshops or interviews: an algorithm reads what your systems already record and shows how your process actually runs.

What is process discovery?

Process discovery draws a process map for you, without anyone having to put the process on paper first. An algorithm reads the event log from your systems and reconstructs how an order, invoice or request really moves through your organisation.

It is the first of three pillars of process mining. The other two are conformance checking (does practice match the design?) and enhancement (what can you learn from the data to improve?). Discovery is where you start: without a map of reality, comparing or improving has little to build on.

How does it work in practice?

The input is always an event log. That is a table with three mandatory columns: a case ID (the order number or invoice number that ties events together), an activity ("order created", "invoice approved", "payment received") and a timestamp. Every row in the table is one event.
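
A minimal sketch of such a table in Python, using only the standard library. The column names case_id, activity and timestamp and the sample rows are assumptions for illustration; real exports name these fields differently.

```python
import csv
import io
from datetime import datetime

# Hypothetical export: one row per event, three mandatory columns.
RAW = """case_id,activity,timestamp
1001,order created,2024-03-01T09:00:00
1001,invoice approved,2024-03-01T11:30:00
1002,order created,2024-03-01T09:15:00
1001,payment received,2024-03-04T14:00:00
1002,invoice approved,2024-03-02T10:00:00
"""

REQUIRED = {"case_id", "activity", "timestamp"}


def load_event_log(text):
    """Parse CSV text into a list of events sorted by case and time."""
    reader = csv.DictReader(io.StringIO(text))
    missing = REQUIRED - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    events = [
        {
            "case_id": row["case_id"],
            "activity": row["activity"],
            "timestamp": datetime.fromisoformat(row["timestamp"]),
        }
        for row in reader
    ]
    # Sorting by case and time puts each case's events in order.
    events.sort(key=lambda e: (e["case_id"], e["timestamp"]))
    return events


events = load_event_log(RAW)
```

After loading, the three events of case 1001 sit next to each other in time order, even though the payment row appeared between the two rows of case 1002 in the raw export.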

The tool groups all events per case and counts how often each path from step A to step B appears. An algorithm decides which paths belong to the main process and which are exceptions, and draws the result as a map with nodes (activities) and arrows (transitions).
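
The grouping and counting step can be sketched in a few lines of Python. This is an illustration under assumptions, not how any particular tool implements it: the sample events and the function names build_traces and directly_follows are made up.

```python
from collections import Counter, defaultdict

# Hypothetical event log: (case_id, activity, timestamp) tuples.
# ISO-formatted strings sort chronologically, which is all we need here.
EVENTS = [
    ("A1", "order created", "2024-03-01T09:00"),
    ("A1", "order approved", "2024-03-01T10:00"),
    ("A1", "goods delivered", "2024-03-02T08:00"),
    ("A2", "order created", "2024-03-01T09:30"),
    ("A2", "order approved", "2024-03-01T12:00"),
    ("A2", "order corrected", "2024-03-01T13:00"),
    ("A2", "order approved", "2024-03-01T15:00"),
    ("A2", "goods delivered", "2024-03-03T08:00"),
]


def build_traces(events):
    """Group events per case and order each case by timestamp."""
    by_case = defaultdict(list)
    for case_id, activity, ts in events:
        by_case[case_id].append((ts, activity))
    return {case: [a for _, a in sorted(rows)] for case, rows in by_case.items()}


def directly_follows(traces):
    """Count how often activity A is immediately followed by activity B."""
    counts = Counter()
    for trace in traces.values():
        counts.update(zip(trace, trace[1:]))
    return counts


traces = build_traces(EVENTS)
dfg = directly_follows(traces)
```

Here the arrow from "order created" to "order approved" is counted twice, while the correction loop in case A2 shows up as two arrows with a count of one each. Those counts are the raw material the discovery algorithm filters.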

The output resembles a BPMN diagram annotated with numbers: how many cases followed each arrow, how long cases take on average between two steps, and where they get stuck. You can filter the map to show only the dominant paths, or only the rare deviations.
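
A rough sketch of how those numbers on the arrows could be computed. The sample traces, the function names arrow_stats and dominant, and the choice to count traversals and report hours are assumptions for illustration.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical traces: per case, a time-ordered list of (activity, timestamp).
TRACES = {
    "A1": [("approved", "2024-03-01T10:00"), ("delivered", "2024-03-02T10:00")],
    "A2": [("approved", "2024-03-01T12:00"), ("delivered", "2024-03-04T12:00")],
    "A3": [("approved", "2024-03-01T09:00"), ("corrected", "2024-03-01T11:00"),
           ("approved", "2024-03-01T12:00"), ("delivered", "2024-03-02T12:00")],
}


def arrow_stats(traces):
    """Return {(a, b): (traversal count, mean hours between a and b)}."""
    hours = defaultdict(list)
    for steps in traces.values():
        for (a, t_a), (b, t_b) in zip(steps, steps[1:]):
            delta = datetime.fromisoformat(t_b) - datetime.fromisoformat(t_a)
            hours[(a, b)].append(delta.total_seconds() / 3600)
    return {arrow: (len(h), sum(h) / len(h)) for arrow, h in hours.items()}


def dominant(stats, min_count):
    """Keep only arrows traversed at least min_count times."""
    return {arrow: s for arrow, s in stats.items() if s[0] >= min_count}


stats = arrow_stats(TRACES)
main_paths = dominant(stats, min_count=2)
```

With this sample, the arrow from "approved" to "delivered" is traversed three times with a mean of 40 hours, and raising min_count to 2 hides the correction loop. Lowering it again brings the rare deviation back into view.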

Which algorithms do tools use?

Under the hood there are a handful of algorithms that each handle messy data differently.

The alpha miner is the classic algorithm, published by Wil van der Aalst and colleagues in IEEE TKDE in 2004. It derives causal relationships from the order in which activities follow one another. Elegant, but fragile: with noisy or incomplete data you get an unreliable model or an unreadable spaghetti diagram. Today it is mostly used for teaching.
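
To make "causal relationships from the order of activities" concrete, here is a sketch of the footprint step the alpha miner starts from. It covers only that first step, not the construction of the full model, and the sample traces and the footprint function name are assumptions.

```python
from itertools import product

# Hypothetical traces (one list of activities per case).
TRACES = [
    ["register", "check stock", "check credit", "ship"],
    ["register", "check credit", "check stock", "ship"],
]


def footprint(traces):
    """Classify every ordered pair of activities by how they follow each other."""
    follows = {(a, b) for t in traces for a, b in zip(t, t[1:])}
    activities = sorted({a for t in traces for a in t})
    relations = {}
    for a, b in product(activities, repeat=2):
        if (a, b) in follows and (b, a) in follows:
            relations[(a, b)] = "||"  # seen in both orders: treated as parallel
        elif (a, b) in follows:
            relations[(a, b)] = "->"  # a is only ever followed by b: causal
        elif (b, a) in follows:
            relations[(a, b)] = "<-"  # reverse causal
        else:
            relations[(a, b)] = "#"   # never directly adjacent
    return relations


relations = footprint(TRACES)
```

In this sample, "register" causally precedes "check stock", the two checks are classified as parallel because they occur in both orders, and "register" and "ship" are never adjacent. The fragility follows directly: a single noisy trace that records two steps in the wrong order flips a causal relation into a parallel one.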

The heuristics miner emerged in the same period, the early-to-mid 2000s, and addressed that weakness. It counts how often one activity follows another and uses thresholds to ignore rare paths. You set how strict the filter is, for example keeping only paths seen in at least 10% of cases, or going down to 1%. That makes it usable on real event logs with duplicate entries and forgotten steps.
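
In its published form, the heuristics miner expresses the threshold as a dependency score between -1 and 1 rather than as a share of cases. The sketch below applies that score, (|a>b| - |b>a|) / (|a>b| + |b>a| + 1), to a made-up log. The sample traces, the function names and the 0.8 threshold are assumptions; tools expose this filter in their own ways.

```python
from collections import Counter

# Hypothetical log: 18 clean cases, one that skips approval, one with rework.
TRACES = (
    [["create", "approve", "pay"]] * 18
    + [["create", "pay"]]
    + [["create", "approve", "create", "approve", "pay"]]
)

# Directly-follows counts over all traces.
dfg = Counter(arc for t in TRACES for arc in zip(t, t[1:]))


def dependency(dfg, a, b):
    """Dependency score: close to 1 means b reliably follows a."""
    ab = dfg.get((a, b), 0)
    ba = dfg.get((b, a), 0)
    return (ab - ba) / (ab + ba + 1)


def filter_arcs(dfg, threshold):
    """Keep only arcs whose dependency score reaches the threshold."""
    return {arc for arc in list(dfg) if dependency(dfg, *arc) >= threshold}


kept = filter_arcs(dfg, threshold=0.8)
```

With a threshold of 0.8 only create to approve and approve to pay survive. The skipped approval scores 0.5 and the rework arc scores negative, so both drop out of the map; lowering the threshold lets them back in.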

The inductive miner is the newer generation, introduced in 2013. It splits the event log recursively and builds a model from blocks (sequence, choice, parallel, loop). The output is always a sound process model, even on messy data, and exports easily to BPMN. It ships with open-source toolkits such as ProM. Commercial tools like Celonis, Apromore, Fluxicon Disco and Microsoft Power Automate Process Mining use their own discovery techniques, which often build on the same ideas of frequency filtering and structured models.
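
A heavily simplified sketch of one splitting step: detecting an exclusive choice by grouping activities that are never linked by an arrow. The real inductive miner also looks for sequence, parallel and loop cuts, recurses into each sub-log and has fallbacks when no cut fits; none of that is shown here. The sample traces and the function names xor_cut and split_log are assumptions.

```python
# Hypothetical traces: payments go either by card or by bank transfer.
TRACES = [
    ["card auth", "card capture"],
    ["bank transfer", "reconcile"],
    ["card auth", "card capture"],
]


def xor_cut(traces):
    """Group activities into sets that no directly-follows arrow connects."""
    neighbours = {a: set() for t in traces for a in t}
    for t in traces:
        for a, b in zip(t, t[1:]):
            neighbours[a].add(b)
            neighbours[b].add(a)
    groups, seen = [], set()
    for start in sorted(neighbours):
        if start in seen:
            continue
        # Depth-first walk collects everything reachable from start.
        group, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node not in group:
                group.add(node)
                stack.extend(neighbours[node] - group)
        seen |= group
        groups.append(group)
    return groups


def split_log(traces, groups):
    """Give each group the traces made up of its activities."""
    return [[t for t in traces if set(t) <= g] for g in groups]


groups = xor_cut(TRACES)
sublogs = split_log(TRACES, groups)
```

Two groups come out, one per payment method, so the model gets a choice block at this level. Each sub-log would then be split again in the same way until only single activities remain, which is why the result is always built from well-formed blocks.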

Process discovery versus a flowchart on paper

Classic process analysis starts with a workshop. People who know the process draw on a whiteboard how it is supposed to run, and you turn that into a BPMN diagram. That map is neat, logical and almost never what really happens.

Process discovery does not start from opinions but from logs your systems already keep. An order in SAP leaves an event, and so do approval, delivery and payment. Tie them together through the order number and the real map comes out on its own.

The difference shows up where theory and practice split. In a drawn order-to-cash process, five steps line up neatly in order. In the discovered map you see that some orders go back for a correction, some skip delivery confirmation and jump straight to invoicing, and there are paths nobody drew on the whiteboard. That is where time and money leak.
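
A small sketch of that comparison: take the five-step path from the whiteboard and list every arrow in the log that nobody drew. The activity names, the sample cases and the function names variants and undrawn_arrows are assumptions for illustration.

```python
from collections import Counter

# Hypothetical designed path from the whiteboard: five steps in order.
DESIGNED = ("order created", "approved", "delivered",
            "delivery confirmed", "invoiced")

# Hypothetical discovered traces, one per order.
TRACES = {
    "O1": DESIGNED,
    "O2": DESIGNED,
    "O3": ("order created", "approved", "corrected", "approved",
           "delivered", "delivery confirmed", "invoiced"),
    "O4": ("order created", "approved", "delivered", "invoiced"),
}


def variants(traces):
    """Count how many cases follow each distinct sequence of activities."""
    return Counter(tuple(t) for t in traces.values())


def undrawn_arrows(traces, designed):
    """Arrows that occur in the log but not in the designed path."""
    drawn = set(zip(designed, designed[1:]))
    seen = Counter(arc for t in traces.values() for arc in zip(t, t[1:]))
    return {arc: n for arc, n in seen.items() if arc not in drawn}


by_variant = variants(TRACES)
surprises = undrawn_arrows(TRACES, DESIGNED)
```

Half of the sample orders follow the designed path. The other half produce three arrows the whiteboard never showed: the correction loop in both directions and the jump from delivery straight to invoicing.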

What do you need to get started?

For a first project:

  • A bounded process. Not your whole organisation at once. Pick something that happens often and lives in a single system: order-to-cash in your ERP, or ticket handling in your service desk.

  • An event log with at least three columns. Case ID, activity and timestamp. Extra fields like actor or amount make the analysis richer but are not mandatory.

  • A tool. For smaller businesses, Power Automate Process Mining is accessible because it is part of Microsoft Power Automate, which many organisations already use. Celonis, Apromore and Fluxicon Disco are established alternatives. ProM is open source and used mostly for research and experimental projects.

  • A sharp question. "Where do orders get stuck between approval and delivery?" gets you a more useful answer than "show me our process".

The biggest challenge is in the data, not the algorithm. Joining events from different sources, picking the right case ID and aligning time zones takes more time than the discovery step itself. Start with one system and expand only once the first map stands.
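
A short illustration of why time zones matter when you join sources. It assumes both systems export ISO 8601 timestamps with a UTC offset; the system names, the sample events and the to_utc helper are made up.

```python
from datetime import datetime, timezone

# Hypothetical exports from two systems that log in different time zones.
ERP_EVENTS = [
    ("4711", "order created", "2024-03-01T09:00:00+01:00"),
    ("4711", "order approved", "2024-03-01T10:30:00+01:00"),
]
WAREHOUSE_EVENTS = [
    ("4711", "goods picked", "2024-03-01T09:45:00+00:00"),
]


def to_utc(events):
    """Parse each timestamp and convert it to UTC."""
    return [
        (case_id, activity, datetime.fromisoformat(ts).astimezone(timezone.utc))
        for case_id, activity, ts in events
    ]


# Correct: merge after converting everything to one time zone.
merged = sorted(to_utc(ERP_EVENTS) + to_utc(WAREHOUSE_EVENTS),
                key=lambda e: (e[0], e[2]))
order = [activity for _, activity, _ in merged]

# Naive: sort on the local clock time and ignore the offset.
naive = sorted(ERP_EVENTS + WAREHOUSE_EVENTS, key=lambda e: (e[0], e[2][:19]))
naive_order = [activity for _, activity, _ in naive]
```

After conversion the picking happens after approval, as you would expect. Sorted on local clock time, the goods appear to be picked before the order was approved, and the discovered map would show a deviation that never took place.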

Last Updated: April 23, 2026
Keywords
process discovery, process mining, event log, BPMN, workflow engine, process analysis, alpha miner, heuristics miner, inductive miner, order-to-cash, process map