( reference : Introduction to Machine Learning in Production )

Selecting and Training a Model

[1] Modeling Overview

Will learn how to improve ML models

( will focus on the modeling part of the full cycle of a ML project )

Contents

1) how to select and train the model
2) how to perform error analysis,

Two types of AI development

1) model-centric AI development
2) data-centric AI development.

has been a lot of emphasis on how to choose the “right model” ( = 1) )

but sometimes, more useful to take a more **data-centric approach ( = 2) ) **

\(\rightarrow\) feed the model “HIGH-quality data”

( not just more and more data, but improve data efficiently )

[2] Key Challenges

Key Challenges while building ML models

Key Framework :

“AI systems = code + data” + ( hyperparameters )

code = algorithm/model

( until now… has been lots of emphasis on “code/model”, but not on “data” )

For many applications, have the flexibility to change the data better way!

( sometimes, will be more efficient to spend a lot of time improving the data )

Iterative Process

model development is an iterative process
step 1) start with model + hyperparameter + data
step 2) train model
step 3) error analysis
- help decide how to improve model/hyperparameters/data
- back to step 1)

Challenges of model development?

have to do well on “TRAINING dataset”
have to do well on “DEV/TEST dataset”
have to do well on “business metric & project goals”

\(\rightarrow\) often overlooked ( good 2 \(\neq\) good 3)

\(\rightarrow\) disagreements between ML team & business team

[3] Why low average error isn’t good enough

Not enough to do just well on dev/test dataset!

\(\rightarrow\) then… what to do more to make project successful??

Challenges ( additional to concept/data drift )

may have low test set error, but when performance on a set of disproportionately important examples isn’t good enough …

\(\rightarrow\) will still not be acceptable for production deployment.
performance on key slices of the data set
rare cases

Ex for challenge 1 ) Web Search

2 types of queries

1) informational and transactional queries
- should return the most relevant results
- users = forgive wrong ranking
2) navigational queries
- user has a very clear intent
- users = unforgive wrong results(urls)

Navigational queries are a disproportionately important set of examples

BUT, average test set accuracy weight all examples equally
\(\rightarrow\) solution : higher weight on the latter …. but just changing the weights of different examples doesn’t always solve the entire problem

Ex for challenge 2 ) ML for loan approval

ML algorithm for loan approval

\(\rightarrow\) should not unfairly discriminate with gender,ethnicity,language.. ( = protected attributes )

Even if well on test set…

will be unacceptable if it shows a bias/discrimination with those protected attributes!

Ex for challenge 2 ) Product Recommendations from retailers

only recommend HOT-products?? what about minor products…?

\(\rightarrow\) treat fairly all major users/retailers/product categories

Ex3 for challenge 3) skewed data distributions

Medical diagnosis

most of patients do not have a certain disease ( ex. 99.9% )
good performance, by just choosing “NO DISEASE”…? no!

Summary

not just to do well on the test set,

but to produce a ML system that meets business goals!

[4] Establish a baseline

first thing to do : “establish a baseline”

then : use tools to efficiently improve on that baseline level!

Example ) speech recognition

\(\rightarrow\) does this mean that we have to focus on “low bandwidth audio”? NO!

should “establish a baseline level” first!

ex) HLP (Human Level Performance)
- by asking some human transcriptionists to label data & measure accuracy

\(\rightarrow\) Helps us decide where to focus on!

HLP (Human Level Performance)

less useful baseline for structured data applications

Ways to establish a baseline

HLP (Human Level Performance)
Literature search for SOTA/open source
Quick-and-dirty implementation
Performance of older system

[5] Tips for getting started

ML is an “iterative process”

[Tip 1] start with a quick literature search!

start with a quick literature search to see what’s possible
if your goal is “building a practical production system” ( not research )

\(\rightarrow\) don’t obsess about finding the latest, greatest algorithm!!

\(\rightarrow\) rather, find something reasonable that lets you get started quickly!!

( ex. open source implementation )
GOOD data + SOSO algorithm > SOSO data + GOOD algorithm

[Tip 2] should we consider deployment constraints?

YES…
- if the baseline is already established &
- confident that this project will work &
- thus the goal is to build and deploy a system.
NO…
- if you have not yet even established a baseline
- or unsure it will work

[Tip 3] quick sanity checks

before running it on ALL HUGE data, check with FEW data first!
advantage :
- lets you find bugs much more quickly!
- if not good on “small” dataset,
  
  then probably also not good on “big” dataset also!

Twitter Facebook LinkedIn

[Week 2-1] Selecting and Training a Model

Seunghan Lee