Concept & Data Drift

(reference : https://towardsdatascience.com/machine-learning-in-production-why-you-should-care-about-data-and-concept-drift-d96d0bc907fb)


Problem : corrupted, late, or incomplete data

\(\rightarrow\) once these are solved, is everything OK? Not quite!


Contents

  1. Model decay
  2. Data drift
  3. Concept drift
  4. How to deal with drift?


1. Model decay

Past performance is no guarantee of future results

= model drift / model decay / staleness


Reason :

  • 1) data drift
  • 2) concept drift


Retraining might help!



2. Data Drift

( Data drift = feature drift = population/covariate shift )

input data has changed!

  • old model might not be suitable for new data!


Example 1) online advertising

  • task : predict how likely a user is to make a purchase

  • feature distribution of “source channel” might change over time!
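One way to put a number on this kind of shift is the Population Stability Index (PSI), which compares the category shares of a feature between a reference period and the current period. The sketch below is only an illustration: the channel names and counts are made up, the helper names are hypothetical, and any alert threshold on the PSI would have to be chosen per task.

```python
import math
from collections import Counter


def category_shares(values, categories, eps=1e-6):
    """Return the share of each category, floored at eps to avoid log(0)."""
    counts = Counter(values)
    total = len(values)
    return {c: max(counts.get(c, 0) / total, eps) for c in categories}


def population_stability_index(reference, current):
    """PSI = sum over categories of (cur - ref) * ln(cur / ref)."""
    categories = sorted(set(reference) | set(current))
    ref = category_shares(reference, categories)
    cur = category_shares(current, categories)
    return sum((cur[c] - ref[c]) * math.log(cur[c] / ref[c]) for c in categories)


# Hypothetical "source channel" values at training time vs. in production.
train_channels = ["search"] * 60 + ["social"] * 30 + ["email"] * 10
prod_channels = ["search"] * 30 + ["social"] * 60 + ["email"] * 10

psi_same = population_stability_index(train_channels, train_channels)
psi_shifted = population_stability_index(train_channels, prod_channels)
print(f"PSI (no change): {psi_same:.3f}")
print(f"PSI (shifted):   {psi_shifted:.3f}")
```

The PSI is zero when the two distributions match and grows as the shares move apart, so it can be tracked over time as a simple drift signal for one feature.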



Example 2) Demographic change

  • the user population ages over time!



“Degree of decay” depends on the task!


Training-serving skew

Cause of skew is different from data drift!

( actually, there is no “drift”, but more like “mismatch” )


Example )

  • TRAIN on artificially constructed or cleaned dataset
  • INFERENCE on real-world dataset



3. Concept Drift

the patterns ( relation between X & Y ) that the model learned have changed
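To make the difference from data drift concrete, here is a toy sketch in which the inputs keep the same distribution but the rule that maps X to Y moves. The thresholds 0.5 and 0.7 and the frozen `model` function are assumptions chosen purely for illustration.

```python
import random

random.seed(0)


def model(x):
    """Frozen model that learned the OLD rule: y = 1 when x > 0.5."""
    return int(x > 0.5)


def accuracy(xs, ys):
    """Share of samples where the frozen model agrees with the label."""
    return sum(model(x) == y for x, y in zip(xs, ys)) / len(xs)


# Inputs come from the same distribution in both periods (no data drift).
xs_old = [random.random() for _ in range(1000)]
xs_new = [random.random() for _ in range(1000)]

# Hypothetical labelling rules: the X -> Y relation moves from 0.5 to 0.7.
ys_old = [int(x > 0.5) for x in xs_old]
ys_new = [int(x > 0.7) for x in xs_new]

acc_old = accuracy(xs_old, ys_old)
acc_new = accuracy(xs_new, ys_new)
print(f"accuracy before drift: {acc_old:.2f}")
print(f"accuracy after drift:  {acc_new:.2f}")
```

The model is unchanged and so are the inputs, yet accuracy drops, because every sample between the old and the new threshold is now labelled differently.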

a) gradual concept drift


follows the gradual changes in “external factors”

examples )

  • competitor launches new products
  • macroeconomic conditions change

each individual change might be small, but together they add up to a big change


b) sudden concept drift


example) COVID-19

  • shopping patterns changed suddenly!

    \(\rightarrow\) change in “demand forecast”
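A sudden drift like this usually shows up as a jump in forecast error, so one simple safeguard is to watch a rolling error and raise an alert when it leaves its usual range. The sketch below assumes a hypothetical daily error series and an arbitrary rule (rolling mean absolute error above twice the baseline); both are illustrative choices, not a recommended setting.

```python
from collections import deque


def first_alert(errors, baseline_mae, window=7, factor=2.0):
    """Return the first index where the rolling mean absolute error exceeds
    factor * baseline_mae, or None if it never does."""
    recent = deque(maxlen=window)
    for i, err in enumerate(errors):
        recent.append(abs(err))
        if len(recent) == window and sum(recent) / window > factor * baseline_mae:
            return i
    return None


# Hypothetical daily forecast errors (actual - predicted): small alternating
# errors for 30 days, then a sudden jump in demand from day 30 onwards.
errors = [(-1) ** d * 5.0 for d in range(30)] + [40.0] * 15

alert_day = first_alert(errors, baseline_mae=5.0)
print("first alert on day:", alert_day)
```

A shorter window reacts faster to a sudden change but is noisier, while a longer window is more stable but slower, which is also why gradual drift needs a longer look-back than sudden drift.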


4. How to deal with drift?

Need to “RETRAIN” the model!

  • method 1) Retrain the model using all available data
  • method 2) Retrain the model using all available data + higher weight on new data
  • method 3) Retrain the model using NEW data


other options

  • domain adaptation
  • building a composition of models that use BOTH old & new data
  • entirely new architecture

