How do you maintain a deployed model?

Once you've developed a machine learning (ML) model, you need consistent model monitoring to observe its performance over time and make sure it stays accurate. But what is a deployed model? A deployed model is a model that has been put into production to perform the task it was trained on. In data science, model deployment is the step that takes an ML model online after it has been developed and trained offline or locally.

Using model monitoring tools, you can observe and analyze model performance throughout the ML lifecycle. Once a model is trained, how do you use it in real-world situations? Let’s talk about deployed models and measuring their performance.

How do you use a deployed model?

When a model is first deployed, its output is based on the training it has undergone and the input data it is consuming. You will want to compare challenger and champion models on performance and bias. It’s also important to monitor model drift, as your model’s features and output will change under real-world operation compared with the initial offline training. Ultimately, using a deployed model requires measuring its performance, fairness, and accuracy.
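
As an illustration, here is a minimal sketch of a champion/challenger comparison, assuming two trained binary classifiers with scikit-learn-style predict and predict_proba methods and a shared labeled holdout set; the names here are illustrative, not part of any particular tool:

```python
from sklearn.metrics import accuracy_score, roc_auc_score

def compare_models(champion, challenger, X_holdout, y_holdout):
    """Score both models on the same holdout set and report metrics side by side."""
    results = {}
    for name, model in [("champion", champion), ("challenger", challenger)]:
        preds = model.predict(X_holdout)
        scores = model.predict_proba(X_holdout)[:, 1]  # positive-class probability
        results[name] = {
            "accuracy": accuracy_score(y_holdout, preds),
            "auc": roc_auc_score(y_holdout, scores),
        }
    return results
```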

How do you measure the performance of a model?

The performance of an ML model is measured by its effectiveness at making decisions in alignment with its purpose. Below are some ML model monitoring techniques that will help you measure performance:

Monitoring

Monitoring an ML model is about ensuring the model performs as intended, with no bias or drift in its output. With the right tools, you should be able to inspect the following (a simple drift-check sketch follows the list):

  • Performance: whether model behavior matches expectations.
  • Drift: loss of predictive power due to changes in inputs or outputs.
  • Outliers: inputs or predictions above or below predetermined bounds.
  • Bias: skewed results caused by flawed model training.
  • Errors: mistakes in predictions or data, or low overall accuracy.
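
One common way to quantify drift is the Population Stability Index (PSI), which compares a feature's training-time distribution to what the model sees in production. Below is a minimal sketch; the bin count and the ~0.2 alert level mentioned in the comment are rules of thumb, not fixed standards:

```python
import numpy as np

def population_stability_index(expected, actual, bins=10):
    """Compare training-time (expected) and production (actual) distributions
    of one feature. A PSI above ~0.2 is a common rule of thumb for drift."""
    # Bin edges are derived from the training distribution; production values
    # outside that range are ignored in this simple version.
    edges = np.histogram_bin_edges(expected, bins=bins)
    expected_pct = np.histogram(expected, bins=edges)[0] / len(expected)
    actual_pct = np.histogram(actual, bins=edges)[0] / len(actual)
    # Clip to a small epsilon to avoid division by zero and log(0).
    eps = 1e-6
    expected_pct = np.clip(expected_pct, eps, None)
    actual_pct = np.clip(actual_pct, eps, None)
    return float(np.sum((actual_pct - expected_pct) * np.log(actual_pct / expected_pct)))
```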

Analysis

By performing root-cause analysis of performance issues, you’ll be able to slice and dice segments and understand the causes of any inaccuracy or unfairness. Understanding the big picture through analytics is essential for effective analysis of ML models. Typically, biases are not identified until after thorough data analysis or after examining the relationship between model predictions and model inputs.
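
As an example of slicing, here is a minimal sketch using pandas, assuming your prediction logs sit in a DataFrame; the column names are illustrative assumptions:

```python
import pandas as pd
from sklearn.metrics import accuracy_score

def accuracy_by_segment(logs: pd.DataFrame, segment_col: str,
                        label_col: str = "label", pred_col: str = "prediction"):
    """Compute per-segment accuracy so underperforming slices stand out."""
    return (
        logs.groupby(segment_col)
            .apply(lambda g: accuracy_score(g[label_col], g[pred_col]))
            .sort_values()  # worst-performing segments first
    )
```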

How to maintain model accuracy over time

As soon as your machine learning model is used in production, its performance starts to degrade. This is because model inputs vary over time, and your model is sensitive to changes in the real world. The algorithm learns from the new data coming in, and that learning, along with the resulting shift, must be monitored.

Manual retraining

Retraining on demand is an investigative approach that can help improve performance. The steps of manual retraining are listed below (a minimal sketch follows the list):

  1. Analyze the development and operational datasets
  2. Audit model performance
  3. Select data for retraining
  4. Engineer features
  5. Build models
  6. Test and evaluate models
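
Here is a hedged sketch of steps 5 and 6, plus the final promotion decision, assuming data has already been selected and features engineered; RandomForestClassifier stands in for whatever model family you actually use, and binary labels are assumed for the F1 score:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

def retrain_and_evaluate(X, y, current_f1):
    """Retrain on freshly selected data; promote only if it beats the current model."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42)
    model = RandomForestClassifier(random_state=42).fit(X_train, y_train)
    new_f1 = f1_score(y_test, model.predict(X_test))
    # Return the retrained model only when it improves on the deployed one.
    return (model, new_f1) if new_f1 > current_f1 else (None, new_f1)
```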

While manual retraining can be effective, it can be quite costly due to the time it takes ML teams to perform, and models often need retraining before it is manually carried out.

Continuous training

The goal of continuous training is to regularly and automatically retrain the model in response to performance changes. Some methods of continuous training are listed below (a trigger sketch follows the list):

  • Interval-based retraining, which retrains the model on a fixed schedule, provided the observed changes in model behavior meet a retraining threshold.
  • Performance-based triggers, which detect when a model deteriorates in production and kick off retraining once a metric crosses a predetermined threshold.
  • Data-change triggers, which detect when the input data distribution fluctuates above or below a threshold and retrain in response.
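
As a sketch of the performance-based trigger, the snippet below tracks a rolling window of prediction outcomes and calls a retraining hook when accuracy dips below a threshold; the window size, threshold, and retrain_fn hook are all illustrative assumptions:

```python
from collections import deque

class PerformanceTrigger:
    """Fire a retraining hook when rolling accuracy falls below a threshold."""

    def __init__(self, retrain_fn, threshold=0.90, window=500):
        self.retrain_fn = retrain_fn
        self.threshold = threshold
        self.outcomes = deque(maxlen=window)  # rolling record of hits and misses

    def record(self, prediction, label):
        self.outcomes.append(prediction == label)
        # Only evaluate once the window is full, to avoid noisy early triggers.
        if len(self.outcomes) == self.outcomes.maxlen:
            accuracy = sum(self.outcomes) / len(self.outcomes)
            if accuracy < self.threshold:
                self.retrain_fn()
                self.outcomes.clear()  # reset the window after triggering
```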

Add dataset samples

Sometimes, an ML model lacks sufficient data to accurately adjust its predictions and perform the desired task after training. ML models need to be trained by example. Adding samples to the retraining dataset will help the model learn from the environment it will actually be operating in.
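
A minimal sketch of that augmentation step, assuming the original training data and newly labeled production samples are pandas DataFrames with matching columns:

```python
import pandas as pd

def extend_training_data(original: pd.DataFrame, recent: pd.DataFrame) -> pd.DataFrame:
    """Append newly labeled production samples to the existing training set."""
    combined = pd.concat([original, recent], ignore_index=True)
    # Drop exact duplicates so repeated log entries don't skew retraining.
    return combined.drop_duplicates()
```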

Cross-validation

In cross-validation, the input data is split into complementary subsets (folds): the model is trained on some folds and evaluated on the held-out fold, and the process repeats so every fold serves as the evaluation set once. This gives a more reliable estimate of how the model generalizes and helps reveal overfitting, such as when a model fails to recognize a pattern outside its training data.
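
A standard k-fold example with scikit-learn's cross_val_score; the dataset and model below are illustrative stand-ins:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)
model = LogisticRegression(max_iter=5000)

# 5-fold CV: train on 4 folds, evaluate on the held-out fold, repeat 5 times.
scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
print(f"fold accuracies: {scores}")
print(f"mean accuracy: {scores.mean():.3f} (+/- {scores.std():.3f})")
```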

Different algorithms 

It’s possible that the algorithm you’ve selected isn’t the best solution to the problem you’re trying to solve. Not only that, but without extensive datasets, it might be difficult for the model to learn any complexities. Experimenting with different algorithms can uncover more detail in the data and improve your model’s performance and predictive power; try several to see which performs best for your data.
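
A hedged sketch of that experiment, comparing a few common scikit-learn classifiers by cross-validated accuracy on the same data; the candidate list is an illustrative assumption:

```python
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def compare_algorithms(X, y):
    """Rank candidate algorithms by mean 5-fold cross-validated accuracy."""
    candidates = {
        "logistic_regression": LogisticRegression(max_iter=5000),
        "random_forest": RandomForestClassifier(random_state=42),
        "gradient_boosting": GradientBoostingClassifier(random_state=42),
    }
    scores = {name: cross_val_score(est, X, y, cv=5).mean()
              for name, est in candidates.items()}
    return dict(sorted(scores.items(), key=lambda kv: kv[1], reverse=True))
```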

Model deployment with Fiddler AI

Fiddler is a complete AI observability solution for the ML lifecycle. Data Science, MLOps, and line-of-business teams use Fiddler to monitor, explain, analyze, and improve their models. Try Fiddler for free today and see how Fiddler can help your team with their model deployment.