
MLOps

MLOps is the working practice that brings machine learning models reliably into production and keeps them healthy there. It combines elements of DevOps with the quirks that ML adds: data dependencies and models that degrade over time.

What is MLOps?

MLOps stands for Machine Learning Operations. It is the combined practice, tooling, and roles needed to move machine learning models from prototype to reliable production and to keep them healthy after that. MLOps is to ML what DevOps is to software: automation, observability, version control, and collaboration, but with a number of extras that ML brings to the table.

An ML solution is different from classic software in three ways:

  • Its behaviour depends not only on the code, but also on the data it was trained on. Run the same code on different data and you get a different model.

  • A model degrades on its own. The world changes, the data shifts, and performance drops without you doing anything. That is model drift.

  • Failures are often harder to see: a buggy API crashes, but a bad model quietly returns plausible-looking wrong answers.

Compare MLOps to running a restaurant rather than building a house. A house, once built, simply stands; a restaurant has to be restocked every day, with quality control on every plate.

Why MLOps?

Without MLOps, a lot of AI work never reaches production, or gets stuck once it does. Common symptoms:

Notebook projects no one else can run
Training happens on one data scientist's laptop, on data they have locally, in an environment nobody documented. A colleague cannot reproduce it, let alone deploy it.

Models that quietly stall
Six months in, the model is making worse predictions than at launch, but nobody notices until the business spots it in a quarterly result.

No trace of what is in production
Which version, trained on which data, with which hyperparameters, in which registry, on which day? None of these answers is available without a long search.

MLOps tackles every one of these with discipline and automation.

Building blocks of an MLOps platform

Version control
Code, data, and models are all versioned: Git for code, DVC or Delta Lake for datasets, and a model registry for the models themselves.
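
A minimal sketch of what data versioning buys you in practice, using DVC's Python API; the repository URL, file path, and tag are illustrative placeholders, not a fixed convention:

```python
import dvc.api

# Read one specific, tagged version of the training data straight from
# the DVC remote. Because the Git tag pins code and data together, the
# same tag always yields the same dataset.
data = dvc.api.read(
    "data/train.csv",                               # placeholder path
    repo="https://github.com/example-org/fraud-model",  # placeholder repo
    rev="v1.2.0",                                   # placeholder Git tag
)
```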

Reproducible training pipeline
One command or pipeline run starts the same training on the same data with the same parameters. Tools like MLflow, Azure Machine Learning, Databricks MLflow, Kubeflow, and Metaflow support this.
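A minimal sketch of such a run in MLflow 2.x style, with the dataset path and hyperparameters as illustrative placeholders: every run records its parameters, metrics, and the resulting model, so it can be reproduced or compared later.

```python
import mlflow
import mlflow.sklearn
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

params = {"n_estimators": 200, "max_depth": 8}   # illustrative hyperparameters

df = pd.read_csv("data/train.csv")               # placeholder dataset
X, y = df.drop(columns=["label"]), df["label"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=42)

with mlflow.start_run():
    mlflow.log_params(params)                    # record the parameters
    model = RandomForestClassifier(**params, random_state=42).fit(X_tr, y_tr)
    auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    mlflow.log_metric("auc", auc)                # record the quality metric
    mlflow.sklearn.log_model(model, "model")     # record the trained artifact
```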

Model registry
A central place where approved models live with their metadata: version, metrics, owner, stage (dev, staging, production). MLflow Model Registry, Azure ML Registry, or Vertex AI Model Registry are common choices.
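A sketch of how registration can look with the MLflow Model Registry; the run ID and model name are placeholders, and note that newer MLflow versions prefer aliases over the stage mechanism shown here:

```python
from mlflow import register_model
from mlflow.tracking import MlflowClient

# Register the model logged in a given run under a central name;
# "<run_id>" and "fraud-detector" are illustrative placeholders.
version = register_model("runs:/<run_id>/model", "fraud-detector")

# Promote the new version so its stage is visible to everyone.
MlflowClient().transition_model_version_stage(
    name="fraud-detector", version=version.version, stage="Staging"
)
```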

Continuous integration and deployment
A new model version runs through automated tests: quality metrics on a test set, bias checks, performance benchmarks. Only on success does it roll out to production, often first as shadow deployment or A/B test.
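One way such a quality gate can look, sketched as a plain pytest check; the threshold, file paths, and column names are assumptions, not a standard:

```python
# test_model_quality.py -- runs in CI before any rollout.
import joblib
import pandas as pd
from sklearn.metrics import roc_auc_score

MIN_AUC = 0.85  # assumed business threshold for promotion

def test_candidate_beats_quality_bar():
    model = joblib.load("artifacts/candidate_model.joblib")  # placeholder path
    test = pd.read_csv("data/holdout.csv")                   # placeholder path
    auc = roc_auc_score(
        test["label"],
        model.predict_proba(test.drop(columns=["label"]))[:, 1],
    )
    # CI fails, and the rollout stops, if the candidate is below the bar.
    assert auc >= MIN_AUC, f"AUC {auc:.3f} below gate {MIN_AUC}"
```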

Monitoring
Input data and output predictions are monitored live for drift. A shifted distribution triggers an alert or even automatic retraining.
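A minimal drift check for a single feature, here using a two-sample Kolmogorov-Smirnov test from SciPy; the significance level and the synthetic data are illustrative:

```python
import numpy as np
from scipy.stats import ks_2samp

def feature_drifted(train_values: np.ndarray,
                    live_values: np.ndarray,
                    alpha: float = 0.01) -> bool:
    """Flag drift when the live distribution of a feature differs
    significantly from the distribution seen at training time."""
    statistic, p_value = ks_2samp(train_values, live_values)
    return p_value < alpha  # small p-value: distributions differ

# Illustrative usage with synthetic data where the live data has shifted.
train = np.random.normal(0.0, 1.0, size=10_000)
live = np.random.normal(0.6, 1.0, size=2_000)
print(feature_drifted(train, live))  # True -> raise an alert or retrain
```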

Feedback loops
Actual outcomes (fraud did or did not happen, customer did or did not churn) flow back into the system. Only then can real performance be measured, not just test-set accuracy.
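A sketch of closing that loop with pandas, assuming prediction logs and outcome records share a case ID; the file names and column names are placeholders:

```python
import pandas as pd

# Prediction log written at inference time, outcomes collected later.
predictions = pd.read_csv("logs/predictions.csv")  # case_id, predicted_fraud
outcomes = pd.read_csv("logs/outcomes.csv")        # case_id, actual_fraud

# Join on the shared case ID and measure realized, not test-set, performance.
joined = predictions.merge(outcomes, on="case_id")
realized_accuracy = (joined["predicted_fraud"] == joined["actual_fraud"]).mean()
print(f"Realized accuracy in production: {realized_accuracy:.1%}")
```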

Roles in MLOps

MLOps is a team sport. Four roles usually appear:

  • Data engineers: build the pipelines that make features available, both at training and at inference time.

  • Data scientists: experiment, train, and validate. They build the model.

  • ML engineers: turn experiments into production pipelines. The bridge between notebook and platform.

  • MLOps engineers / platform team: build and run the platform itself so everyone else can work without infrastructure toil.

Smaller organisations collapse these into fewer roles, but the tasks remain.

MLOps versus DevOps versus DataOps

DevOps focuses on the lifecycle of application code. Tests, deployments, service monitoring.

DataOps focuses on the lifecycle of data pipelines. Data lineage, data quality, orchestration.

MLOps combines both and adds the model-specific parts: experiments, feature stores, drift monitoring, retraining.

In mature organisations these disciplines converge in one integrated data and AI platform. Microsoft Fabric, for instance, puts data engineering, analytics, and model deployment side by side, and integrates with Azure Machine Learning for full MLOps capabilities.

Pitfalls

Pilots that never reach production
Many AI projects stop at a dashboard of proud test accuracies. Start early with the production question: where does this run, who uses it, what are the SLAs? Without answers, every project stays a prototype.

A platform too complex for the team
A mature MLOps stack quickly reaches twenty tools. Small teams drown. Start minimal: a model registry, a CI pipeline, basic monitoring. Expand when the pain appears.

Drift alerts nobody reads
Monitoring without an owner is just decoration. Define explicitly who reviews drift alerts and which action follows.

Data and model privacy bolted on late
Personal data handling, ethics, and AI Act obligations belong in the pipeline from the start, not bolted on afterwards. A good MLOps platform automatically documents which data was used and why.

Retraining as the only answer to drift
Not every drift calls for retraining. Sometimes the right move is a redesign, retirement of the model, or adding human oversight. MLOps automates, but does not remove the thinking.

Last Updated: April 23, 2026