Your Model Is Not Done: Understanding and Addressing Model Drift

You have put your model into production.

It is making predictions and serving users.

The pipeline is automated.

Now it's time to sit back and relax; your work is done.

I also like to dream.

Okay, back to reality. Let's discuss model drift: what it is, why it happens, how to detect it, and how to deal with it before it destroys your model's performance, and stakeholder trust along with it.

What is Model Drift?

Model drift is the degradation of a predictive model's performance over time, and even the most powerful, accurate models are vulnerable to it. Model drift is not a reflection of poor training methods or bad data collection; it is something all data scientists need to keep an eye on.

Image via VectorElements on Unsplash

Let's look at an example. A binary classification model is trained on two years of historical data. Performance is good: AUC is in the low 0.9s, and precision and recall are both high enough. The model passes peer review and makes it to production, where it starts making live predictions. After 90 days, the data scientist queries the predictions the model made in production and runs them through a validation script that calculates performance metrics. The performance is right in line with expectations from the POC (proof of concept), and this is communicated to the stakeholders: “The model works as expected. The predictions are accurate.”

Fast forward two years. A request to investigate the model comes in. It is reported that the model consistently makes wrong predictions, and users no longer trust it. There is also talk of going back to the old Excel spreadsheet method if things continue this way. The data scientist queries data from the past 6 months and runs it through the validation script. They rub their eyes, check their notes, and are shocked. AUC sits at 0.6, and precision and recall are both very low. “How can this happen? I trained a top model. I even validated this model after it went live! What happened?” the data scientist wonders. Model drift is what happened. It crept in, went unnoticed for months, and wreaked havoc on the predictions.

This is the harsh reality that many predictive models face in production. Let's discuss why it happens.

Why Does Model Drift Happen?

Boiled down, model drift occurs because models live in the real world. The model was trained on one reality, and that reality has changed since the model moved to production.

One of the most common causes of model drift is a change in the way data is recorded. When the data was initially collected for training, the predictors and targets looked one way; now, they look different. The algorithm learned a certain relationship between them, but that relationship has since changed. The model has no way to account for the new relationship, so it keeps predicting as if the old one still held, and its predictions suffer for it.

Model drift generally falls into two categories:

Data Drift (features change)

Concept Drift (the relationship between features and target changes, or the population changes)

Let's look at some examples.

Example #1: Data Drift

Height and weight are used to predict diabetes risk. A data scientist collected two years of patient data, recording each patient's height in inches, weight in pounds, and whether that patient developed diabetes within a year of the measurement. Two years later, a new measurement process requires nurses to record height in centimeters and weight in kilograms, and the model starts making inaccurate predictions because of it. For example, a patient who is 6 feet tall used to have a height reading of 72 (inches), but now has a height reading of 183 (centimeters). The same patient weighed 200 pounds, now recorded as 91 (kilograms). The model does not know that any conversion needs to occur to account for the change in units. It expects features in the units it was trained on, so it treats this patient as someone 183 inches (more than 15 feet) tall who weighs 91 pounds. Any prediction built on inputs like that is absurd!
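The arithmetic of that mismatch can be sketched in a few lines of Python. This is a hypothetical illustration; the constants and variable names are mine, not from any real system:

```python
CM_PER_INCH = 2.54
KG_PER_POUND = 0.453592

# Training data recorded height in inches and weight in pounds:
height_in, weight_lb = 72, 200                 # a 6-foot, 200-pound patient

# The new process records the same patient in centimeters and kilograms:
height_cm = round(height_in * CM_PER_INCH)     # 183
weight_kg = round(weight_lb * KG_PER_POUND)    # 91

# The model, still expecting inches, silently misreads 183 as inches:
implied_feet = height_cm / 12
print(round(implied_feet, 2))                  # 15.25 -> a "15-foot" patient
```

The raw numbers arrive without any unit label attached, so nothing in the pipeline flags the problem; the model simply ingests 183 as if it were inches.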

Example #2: Concept Drift

A readmission risk model was developed for a hospital system by their team of data scientists. Three years after going live, the health system acquired four major hospitals in a neighboring region. These hospitals serve patient populations that are very different from the one the model was trained on. When the model is rolled out to the new hospitals, providers notice that it makes many false positive and false negative predictions. The model must be retrained on data that includes these new hospitals.

How to Find and Fix Model Drift

Model drift can be gradual, with performance eroding slowly over a long period of time, or it can be abrupt, with performance dropping suddenly and sharply. This unpredictable nature can make it difficult to prepare for and even more difficult to identify without the right tools.

Photo by the author

Monitoring performance in production regularly is the best way to detect model drift.

If you don't monitor your model in production, you won't see the drift until the stakeholders do.

A quick dashboard or notebook that can be run every few weeks is an easy way to track the performance of the model and catch any deterioration over time. Simply plot precision, recall, AUC, MAE, MSE, or whatever performance metrics are relevant to your model on the y-axis, and date on the x-axis. Small variation from week to week is expected, but a large deviation from the average signals that something has changed and drift is likely. Missing-value and feature-distribution plots can help you drill down into individual features and pin down the cause of the drift. These might show the count of NA or NULL values for each feature over time, or the mean value of each feature over time.
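As a sketch of that monitoring notebook, here is one way to compute a weekly AUC from a production predictions log, assuming pandas and scikit-learn are available and the log has `date`, `y_true`, and `y_score` columns (those names are my assumption):

```python
import pandas as pd
from sklearn.metrics import roc_auc_score

def weekly_auc(log: pd.DataFrame) -> pd.Series:
    """AUC per calendar week from a log with 'date', 'y_true', 'y_score'."""
    week = pd.to_datetime(log["date"]).dt.to_period("W")
    return log.groupby(week).apply(
        lambda g: roc_auc_score(g["y_true"], g["y_score"])
    )

# weekly_auc(log).plot() then gives the metric-vs-date chart described above.
```

The same grouping pattern works for precision, recall, MAE, or MSE; only the metric function inside the lambda changes.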

I actually caught model drift on one of my models using this method. I noticed a drop in accuracy on my Difficult IV Access model. After a few weeks of accuracy rates lower than expected, I became suspicious. My supervisor suggested looking at feature missingness as a possible cause. Sure enough, the third most important feature, history of undernourishment, had a huge spike in NULL values the very week my model's performance started to deteriorate. We discovered that the SQL driving feature creation in production had recently changed, and joins were not behaving as intended. We updated the SQL, and accuracy returned to normal levels from that day on.
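A missingness check like the one that caught this can be sketched with pandas alone; the column name below is illustrative, not the real feature:

```python
import pandas as pd

def weekly_null_rates(log: pd.DataFrame, feature_cols) -> pd.DataFrame:
    """Fraction of NULL/NA values per feature per calendar week.
    A sudden jump in one column often points to a broken upstream join."""
    week = pd.to_datetime(log["date"]).dt.to_period("W")
    return log[feature_cols].isna().groupby(week).mean()
```

Plotting each column of the result over time makes a spike in a single feature's NULL rate stand out immediately.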

Photo by Sayyam Abbasi on Unsplash

This brings me to my final point: how to fix model drift. There are several correction methods, each suited to different situations. As you saw above, one way to correct for drift is to transform the incoming data into the same format the model was originally trained on. This is a quick, easy fix and should be the default when possible. It can happen anywhere along the data's path, from the ETL jobs feeding the database to the scoring code where predictions are made. If height is now recorded in centimeters and your model expects inches, the conversion can be applied before the prediction.
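In the scoring code, that fix is just a thin adapter applied before the predict call. A minimal sketch, assuming a scikit-learn-style model object and the cm/kg mismatch from the earlier example:

```python
IN_PER_CM = 1 / 2.54
LB_PER_KG = 1 / 0.453592

def score(model, rows):
    """rows: (height_cm, weight_kg) pairs as recorded by the new process.
    Convert each row back to the inches/pounds the model was trained on,
    then hand the adapted features to the model."""
    adapted = [(h * IN_PER_CM, w * LB_PER_KG) for h, w in rows]
    return model.predict(adapted)
```

Keeping the conversion in one well-commented place also documents the unit change for whoever maintains the pipeline next.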

Sometimes, however, the data cannot be changed. Maybe data governance has formally defined the data point, the units are now fixed, and they differ from the ones your model was trained on. Or the workflow prevents the data from arriving in the old format. Another solution, though it requires more effort, is to retrain the model. Retraining on new data lets the model relearn the relationships between the variables, producing a model that works reliably on the data it now receives. Changes in the underlying population, as in the hospital example, almost always require retraining.
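Retraining itself can be as simple as refitting on a recent window of labeled data. A sketch with scikit-learn; the column names, the model class, and the six-month window are all assumptions for illustration, not a prescription:

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

def retrain_on_recent(log: pd.DataFrame, feature_cols, target_col, months=6):
    """Refit on the most recent labeled data so the model relearns
    the current relationships between features and target."""
    dates = pd.to_datetime(log["date"])
    recent = log[dates >= dates.max() - pd.DateOffset(months=months)]
    model = LogisticRegression(max_iter=1000)
    model.fit(recent[feature_cols], recent[target_col])
    return model
```

In practice you would also re-validate the refit model against a held-out slice of the new data before swapping it into production.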

Wrapping up

Model drift can sneak up on any unwary data scientist. Let it go on long enough and it can destroy performance and user confidence. But, it is not something to be afraid of. With the right tools, finding drift is possible, and fixing it is attainable. Being able to see when model drift occurs, and having the knowledge to identify the cause and determine a fix, is what separates data scientists who are just happy to get a model into production from those who know how to build a model that can have a lasting impact.
