Unsupervised learning

What is unsupervised learning?

Unsupervised learning is a form of machine learning where a computer spots patterns in data on its own, without anyone telling it what is right or wrong. You hand the system a pile of data and let it work out the structure for itself.

Imagine you have a thousand photos of animals, but no labels. The computer cannot know that one is a dog and another is a sparrow. What it can do is notice which photos look alike. It might end up with a group of "four-legged things" and a group of "things with wings", without ever learning the words for them.

That is the heart of it: self-directed learning. Where supervised learning trains on examples paired with the correct answer ("this is a cat, this is a dog"), unsupervised learning works without that guidance. The model has to figure things out on its own.

How unsupervised learning works

The logic underneath is mathematical and statistical. The system looks for similarities, distances, or relationships between data points. Instead of predicting answers, it tries to organise or summarise the data.
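To make "distance between data points" concrete, here is a minimal sketch. The three customers and their values are invented for illustration, and Euclidean distance is only one of several measures a system might use.

```python
import math

# Hypothetical customers described by (age, orders per month); values are made up.
alice = (34, 5)
bob = (36, 6)
carol = (62, 1)

# Euclidean distance: the smaller the number, the more alike two points are.
alice_to_bob = math.dist(alice, bob)
alice_to_carol = math.dist(alice, carol)

print(f"alice-bob:   {alice_to_bob:.2f}")
print(f"alice-carol: {alice_to_carol:.2f}")
```

Alice and Bob end up much closer to each other than Alice and Carol, which is the kind of signal a grouping algorithm builds on.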

The most common technique is clustering, where the model groups data points that resemble each other. Think of customers who tend to buy the same products, or machines that show similar energy patterns. Another important idea is dimensionality reduction, which boils data with hundreds of variables down to a handful of underlying factors. That makes complex information easier to read and analyse.
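As an illustration of clustering, the sketch below implements a bare-bones version of k-means, one well-known clustering algorithm. The customer data, the function name `kmeans`, and the choice to start from the first k points are all assumptions made to keep the example small and deterministic; real projects would normally use a tested library instead.

```python
import math

def kmeans(points, k, iterations=10):
    """Tiny k-means: returns (centroids, labels) for a list of numeric tuples."""
    # Assumption: start from the first k points so the run is deterministic.
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, labels

# Made-up data: (orders per month, average basket value).
customers = [(1, 20), (9, 95), (2, 25), (1, 22), (10, 100), (8, 90)]
centroids, labels = kmeans(customers, k=2)
print(labels)  # [0, 1, 0, 0, 1, 1]
```

Nobody told the function which customers are "occasional" and which are "frequent"; the two groups fall out of the distances alone.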

There is also association learning, which hunts for relationships between items. Retailers use it to discover that customers who buy pasta often pick up tomato sauce in the same trip.
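The pasta example can be sketched with two standard association measures, support and confidence. The baskets are invented, and the helper names `support` and `confidence` are illustrative rather than taken from any particular library.

```python
def support(baskets, items):
    """Share of all baskets that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in baskets) / len(baskets)

def confidence(baskets, antecedent, consequent):
    """Of the baskets containing `antecedent`, the share that also has `consequent`."""
    antecedent, consequent = set(antecedent), set(consequent)
    with_antecedent = [b for b in baskets if antecedent <= b]
    if not with_antecedent:
        return 0.0
    with_both = [b for b in with_antecedent if consequent <= b]
    return len(with_both) / len(with_antecedent)

# Made-up shopping baskets.
baskets = [
    {"pasta", "tomato sauce", "cheese"},
    {"pasta", "tomato sauce"},
    {"pasta", "bread"},
    {"milk", "bread"},
    {"pasta", "tomato sauce", "milk"},
]

print(support(baskets, {"pasta", "tomato sauce"}))       # 0.6
print(confidence(baskets, {"pasta"}, {"tomato sauce"}))  # 0.75
```

In this toy data, 60% of all baskets contain both items, and 75% of the baskets with pasta also contain tomato sauce.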

The principle holds across the board: the model never knows what is correct; it only learns what tends to go together.

Unsupervised learning works with almost any kind of data, as long as that data can be measured. Numerical data is the most common: revenue, ages, or website clicks. Text data, such as customer reviews or support emails, can also be processed to surface themes. With image data, the model can group similar photos or products automatically. Even time series, such as sensor readings or seasonal sales, can be analysed for hidden patterns or anomalies.

The rule of thumb: as long as you can turn the data into numbers that can be compared, unsupervised learning has something to work with.
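One simple way to turn text into comparable numbers is a bag-of-words count combined with cosine similarity. This is a sketch under that assumption; the reviews are invented, and production systems typically use richer representations.

```python
import math
from collections import Counter

def to_vector(text):
    """Bag-of-words: count how often each lowercase word appears."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    """Close to 1.0 for a similar word mix, 0.0 when no words are shared."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Made-up customer reviews.
reviews = [
    "delivery was late and the box was damaged",
    "late delivery and a damaged box",
    "great price and friendly support",
]
vectors = [to_vector(r) for r in reviews]

similar = cosine_similarity(vectors[0], vectors[1])
different = cosine_similarity(vectors[0], vectors[2])
print(f"review 1 vs 2: {similar:.2f}")
print(f"review 1 vs 3: {different:.2f}")
```

The two delivery complaints score much higher against each other than against the positive review, so a clustering step could place them in the same theme.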

When do you use unsupervised learning?

You reach for unsupervised learning when you have plenty of data but no predefined categories or known answers. The goal is not to predict but to discover. It often shows up in the exploratory phase of an analysis, when you want to understand which groups or trends actually exist before you commit to a model.

In a business setting that often means customer segmentation, churn analysis, inventory optimisation, or feedback analysis. It surfaces fresh insight without you needing to know in advance what you are looking for. Companies use it to group customers by buying behaviour, to spot fraud through unusual patterns, or to sort piles of free-text feedback into themes automatically.

It is also useful as a preparation step for other forms of machine learning. By simplifying or grouping data first, later models can train more efficiently and produce sharper results.
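The "simplifying" step is often a form of dimensionality reduction. The sketch below assumes principal component analysis (PCA), computed with plain power iteration, and collapses three correlated columns into one score per row. The data and the function name `first_principal_component` are illustrative assumptions, and the code expects the columns to actually vary.

```python
import math

def first_principal_component(rows, iterations=100):
    """Return (direction, scores): the main axis of variation and each row's position on it."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered columns.
    cov = [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Power iteration: repeatedly multiplying by the covariance matrix
    # pulls the vector toward the direction of largest variance.
    direction = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * direction[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        direction = [x / norm for x in w]
    # Project every row onto that single direction.
    scores = [sum(c[j] * direction[j] for j in range(d)) for c in centered]
    return direction, scores

# Made-up rows with three strongly correlated measurements each.
rows = [(1.0, 2.0, 3.0), (2.0, 4.0, 6.1), (3.0, 6.0, 8.9), (4.0, 8.0, 12.0)]
direction, scores = first_principal_component(rows)
print([round(s, 2) for s in scores])
```

A later model could then train on the single score instead of all three columns.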

Examples of unsupervised learning in business

Inside companies, unsupervised learning has surprisingly broad reach. Marketing teams use it to split customers into segments. That lets them tailor promotions or newsletters to different audiences without anyone having to define those audiences by hand.

In e-commerce it powers product recommendations. By analysing buying behaviour, the system discovers which products are often bought together, which gives you the familiar "customers who bought this also bought..." suggestions.

Risk management uses it too. In finance or accounting, the system can flag transactions that look unusual compared to the rest, which often points to fraud or input errors. In manufacturing, it analyses sensor data to catch drifting machine performance early, before it turns into downtime.
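Flagging unusual transactions can be sketched with a modified z-score built on the median and the median absolute deviation (MAD). The constant 0.6745 and the cut-off of 3.5 are a common convention for this statistic, not something this entry prescribes, and the amounts are invented.

```python
import statistics

def flag_outliers(values, threshold=3.5):
    """Return the values whose modified z-score exceeds `threshold`."""
    median = statistics.median(values)
    # Median absolute deviation: a spread measure that outliers barely affect.
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        return []  # no spread at all, so nothing can be ranked as unusual
    return [v for v in values if 0.6745 * abs(v - median) / mad > threshold]

# Made-up transaction amounts; one of them is far outside the usual range.
amounts = [120, 135, 110, 128, 131, 4900, 125, 118]
flagged = flag_outliers(amounts)
print(flagged)  # [4900]
```

The function has no notion of "fraud"; it only reports that one amount sits far from the rest, and a person still has to decide whether that is fraud, an input error, or a legitimate large order.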

Even HR teams find a use for it. By grouping employees on skills or performance signals, companies can target training more precisely or build teams with complementary profiles.

Strengths and limits of unsupervised learning

The biggest strength is that you do not need labelled data. That saves a huge amount of preparation work. It can also reveal structure or relationships that people would be unlikely to spot manually. It works across many data types and makes a strong starting point for further analysis.

The downside is that the results are hard to grade. There is no "correct" answer, so you cannot always tell whether the patterns the model found are genuinely useful. Sometimes it surfaces coincidences that mean nothing. Interpretation also takes experience and domain knowledge, and small changes to the data or settings can produce quite different outcomes.

Unsupervised learning surfaces patterns no one would spot by hand, but a human still needs to judge what actually matters.

Last Updated: April 18, 2026

Keywords

unsupervised learning, machine learning, supervised learning, reinforcement learning, clustering, dimensionality reduction, customer segmentation, anomaly detection, AI, artificial intelligence
