MLOps · Advanced · 5h

Monitoring models.

Tracking accuracy, latency, and inputs in production.

What is model monitoring?

Model monitoring watches a deployed model's behavior — its latency, its inputs, and, where possible, its accuracy — to catch problems after deployment. Unlike regular software, a model can silently get worse as the world changes, without throwing a single error.

Why it matters

A model that was accurate at launch can quietly decay as real-world data drifts away from its training data. Without monitoring, you find out from angry users, not your dashboards. Monitoring is what makes a deployed model trustworthy over time, and it is a defining MLOps responsibility.

What to learn

  • Operational metrics: latency, throughput, errors
  • Input monitoring and detecting out-of-range data
  • Prediction distribution monitoring
  • Measuring accuracy when labels arrive late
  • Alerting on degradation
  • Logging predictions for later analysis
  • Closing the loop back to retraining
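
As an illustration of the input-monitoring item above, here is a minimal sketch of an out-of-range check. It assumes features arrive as flat dictionaries of numbers and that the training set's per-feature min and max are a reasonable definition of "in range". The names `FeatureRange`, `fit_ranges`, and `out_of_range` are hypothetical, and so is the sample data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureRange:
    """Min and max of one feature as seen in the training data."""
    low: float
    high: float


def fit_ranges(training_rows):
    """Record the per-feature min and max from training rows (a list of dicts)."""
    ranges = {}
    for name in training_rows[0]:
        values = [row[name] for row in training_rows]
        ranges[name] = FeatureRange(min(values), max(values))
    return ranges


def out_of_range(row, ranges):
    """Return the names of features that are missing or outside the training range."""
    flagged = []
    for name, bounds in ranges.items():
        value = row.get(name)
        if value is None or not (bounds.low <= value <= bounds.high):
            flagged.append(name)
    return flagged


# Hypothetical training data and live requests, for illustration only.
training = [{"age": 23, "income": 31000.0}, {"age": 61, "income": 120000.0}]
ranges = fit_ranges(training)
print(out_of_range({"age": 40, "income": 50000.0}, ranges))   # []
print(out_of_range({"age": 140, "income": 50000.0}, ranges))  # ['age']
```

In practice you would likely count flagged requests as a metric rather than reject them, so a rising out-of-range rate shows up on a dashboard.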

Common pitfall

Monitoring only the service (uptime, latency) and not the model (input drift, prediction quality). The API can be perfectly healthy while the model's predictions have quietly become garbage. Monitor the model's inputs and outputs, not just whether the server responds, because silent quality decay is the failure unique to ML.
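
One way to watch the model's outputs rather than the server is to compare the live distribution of prediction scores against a reference sample. The sketch below uses the Population Stability Index (PSI) as the comparison, which is a swapped-in technique and not something this page prescribes. The function name `psi`, the equal-width binning, and the `1e-6` floor are all assumptions made for illustration.

```python
import math


def psi(expected, actual, n_bins=10):
    """Population Stability Index between two samples of a numeric value.

    Bin edges are equal-width over the range of `expected`; values in `actual`
    that fall outside that range are clamped into the first or last bin.
    """
    low, high = min(expected), max(expected)
    width = (high - low) / n_bins or 1.0  # avoid zero width for constant data

    def bin_fractions(values):
        counts = [0] * n_bins
        for v in values:
            index = int((v - low) / width)
            counts[min(max(index, 0), n_bins - 1)] += 1
        # Small floor so an empty bin does not produce log(0) or division by zero.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = bin_fractions(expected), bin_fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Hypothetical score samples: a reference window and a shifted live window.
reference_scores = [i / 100 for i in range(100)]
live_scores = [0.5 + i / 200 for i in range(100)]

print(psi(reference_scores, reference_scores))  # identical samples give 0.0
print(psi(reference_scores, live_scores) > 0.25)  # True: a large shift
```

The 0.25 cutoff is a common rule of thumb for a significant shift, not a universal threshold. Tune it against drift you have actually observed.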

Practice

For your served model, log every prediction with its inputs, and add metrics for latency and the distribution of predictions. Simulate input drift by sending data unlike the training set, and confirm that your monitoring flags it. You are done when your monitoring would detect quality decay before users complain.
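
A possible starting point for the logging part of this exercise is sketched below. It assumes the model is a plain Python callable and that one JSON line per prediction is an acceptable log format. `predict_and_log`, `toy_model`, and the record field names are hypothetical.

```python
import io
import json
import time
import uuid


def predict_and_log(model_fn, features, log_file):
    """Call the model, time it, and append one JSON line describing the call."""
    start = time.perf_counter()
    prediction = model_fn(features)
    latency_ms = (time.perf_counter() - start) * 1000.0

    record = {
        "prediction_id": str(uuid.uuid4()),  # lets delayed labels be joined later
        "timestamp": time.time(),
        "features": features,
        "prediction": prediction,
        "latency_ms": latency_ms,
    }
    log_file.write(json.dumps(record) + "\n")
    return prediction, record["prediction_id"]


# Hypothetical stand-in for a real model, for illustration only.
def toy_model(features):
    return 1 if features["amount"] > 100 else 0


# An in-memory buffer stands in for a real log file or log shipper.
buffer = io.StringIO()
prediction, prediction_id = predict_and_log(toy_model, {"amount": 250}, buffer)
logged = json.loads(buffer.getvalue())
print(logged["prediction"], logged["features"])
```

Storing a `prediction_id` with each record is a design choice that makes it possible to attach ground-truth labels later, when they arrive.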

Outcomes

  • Track both operational and model-quality signals.
  • Detect input drift and shifting prediction distributions.
  • Measure accuracy when labels arrive with a delay.
  • Alert on degradation and feed it back to retraining.
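
To make the delayed-label and alerting outcomes concrete, here is a minimal sketch. It assumes predictions were logged under an ID and that labels arrive later keyed by the same ID. The function names, the `max_drop` and `min_labeled` defaults, and the sample data are hypothetical.

```python
def delayed_accuracy(predictions, labels):
    """Accuracy over the predictions whose labels have arrived so far.

    predictions: dict of prediction_id -> predicted class
    labels:      dict of prediction_id -> true class (a subset, arriving late)
    Returns (accuracy, labeled_count); accuracy is None when nothing is labeled.
    """
    matched = [(predictions[pid], y) for pid, y in labels.items() if pid in predictions]
    if not matched:
        return None, 0
    correct = sum(1 for predicted, actual in matched if predicted == actual)
    return correct / len(matched), len(matched)


def should_alert(accuracy, labeled_count, baseline, max_drop=0.05, min_labeled=50):
    """Alert only when enough labels exist and accuracy fell well below baseline."""
    if accuracy is None or labeled_count < min_labeled:
        return False
    return accuracy < baseline - max_drop


# Hypothetical data: 100 logged predictions, labels so far for the first 60.
predictions = {f"p{i}": i % 2 for i in range(100)}
labels = {f"p{i}": 0 for i in range(60)}

accuracy, labeled = delayed_accuracy(predictions, labels)
print(accuracy, labeled)                             # 0.5 60
print(should_alert(accuracy, labeled, baseline=0.9)) # True
```

The `min_labeled` guard keeps a handful of early labels from triggering a noisy alert. An alert that fires could then start a retraining job or a manual review.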