Error analysis and performance auditing

[1] Error Analysis example

Error Analysis

Error Analysis is an iterative process

finding defects in smartphones

example of tags
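
Tagging can be sketched in code as a simple tally: for each misclassified example, record its tags, then count how often each tag appears. The tag names and records below are hypothetical, chosen to match the smartphone-defect setting; they are not from the original slides.

```python
from collections import Counter

# Hypothetical error-analysis log: one entry per misclassified example,
# with the tags a reviewer assigned to it (tag names are illustrative).
misclassified = [
    {"id": 1, "tags": ["scratch"]},
    {"id": 2, "tags": ["scratch", "low_light"]},
    {"id": 3, "tags": ["dent"]},
    {"id": 4, "tags": ["low_light"]},
    {"id": 5, "tags": ["scratch"]},
]

def tag_fractions(examples):
    """Fraction of misclassified examples carrying each tag, most common first."""
    counts = Counter(tag for ex in examples for tag in ex["tags"])
    total = len(examples)
    return {tag: n / total for tag, n in counts.most_common()}

fractions = tag_fractions(misclassified)
for tag, frac in fractions.items():
    print(f"{tag}: {frac:.0%} of errors")
```

An example can carry several tags, so the fractions need not sum to 100%.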

rightmost column: contribution to raising average accuracy
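
One way such a column might be computed (an assumption, since the table itself is not reproduced here) is the gap between a baseline and the current accuracy, multiplied by the share of data in that category. The category names and numbers below are illustrative only.

```python
# Illustrative numbers only (not from the notes): per-category accuracy,
# a baseline such as human-level performance, and the share of the data.
categories = {
    "clean_speech":  {"accuracy": 0.94, "baseline": 0.95, "share": 0.60},
    "car_noise":     {"accuracy": 0.89, "baseline": 0.93, "share": 0.04},
    "people_noise":  {"accuracy": 0.87, "baseline": 0.89, "share": 0.30},
    "low_bandwidth": {"accuracy": 0.70, "baseline": 0.70, "share": 0.06},
}

def potential_gain(stats):
    """Gap to baseline, weighted by how much of the data the category covers."""
    gap = max(stats["baseline"] - stats["accuracy"], 0.0)
    return gap * stats["share"]

# Rank categories by how much closing the gap would raise average accuracy.
ranked = sorted(categories, key=lambda name: potential_gain(categories[name]), reverse=True)
for name in ranked:
    print(f"{name}: +{potential_gain(categories[name]):.2%} average accuracy")
```

A category with a large gap but a tiny share of the data can end up ranked below a category with a small gap and a large share, which is the point of the weighting.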

Which category to focus on?

After choosing which category to focus on…

Skip

even if the model does well on accuracy/F1 score…

run a “performance audit” before pushing it to production!

\(\rightarrow\) might save you from significant post deployment problems

Double check your system!

(accuracy, fairness/bias, etc.)

step 1) brainstorm the ways the system might go wrong
- performance on “subsets of data”
  - ex) gender, age, ethnicity..
- how common are certain errors
  - ex) FP, FN
- performance on rare cases
step 2) establish metrics to assess performance against those issues
- performance on slices of the data (not on the entire dev set)
- after establishing metrics… MLOps tools can help automate the evaluation!
  - ex) TFMA (TensorFlow Model Analysis)
step 3) get buy-in from the business or product owner
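
Step 2 can be sketched in plain Python as a per-slice report: accuracy plus false-positive and false-negative counts for each value of a slicing attribute. This is a minimal sketch, not the TFMA API; the field names (`age_group`, `label`, `pred`), the records, and the threshold are all hypothetical.

```python
from collections import defaultdict

def slice_report(records, slice_key):
    """Per-slice accuracy and FP/FN counts for binary labels (1 = positive)."""
    stats = defaultdict(lambda: {"n": 0, "correct": 0, "fp": 0, "fn": 0})
    for r in records:
        s = stats[r[slice_key]]
        s["n"] += 1
        if r["pred"] == r["label"]:
            s["correct"] += 1
        elif r["pred"] == 1:
            s["fp"] += 1  # predicted positive, actually negative
        else:
            s["fn"] += 1  # predicted negative, actually positive
    return {k: {**v, "accuracy": v["correct"] / v["n"]} for k, v in stats.items()}

# Hypothetical dev-set predictions, sliced by an assumed "age_group" field.
records = [
    {"age_group": "18-30", "label": 1, "pred": 1},
    {"age_group": "18-30", "label": 0, "pred": 0},
    {"age_group": "18-30", "label": 0, "pred": 1},
    {"age_group": "60+", "label": 1, "pred": 0},
    {"age_group": "60+", "label": 1, "pred": 1},
    {"age_group": "60+", "label": 0, "pred": 0},
    {"age_group": "60+", "label": 1, "pred": 0},
]

report = slice_report(records, "age_group")

# Assumed acceptance threshold; in practice this is agreed with the product owner.
MIN_SLICE_ACCURACY = 0.6
failing = [k for k, v in report.items() if v["accuracy"] < MIN_SLICE_ACCURACY]
print(report)
print("slices below threshold:", failing)
```

Reporting per slice rather than over the whole dev set is what surfaces a group that the aggregate number hides.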

Example) speech recognition

step 1) brainstorm the ways the system might go wrong
- ex) accuracy on different genders/ethnicities
- ex) accuracy on different devices
- ex) prevalence of rude mis-transcriptions
  - “GAN” (generative adversarial network) \(\rightarrow\) “gang”? “gun”?
step 2) establish metrics to assess performance against those issues
- ex) mean accuracy on different genders/ethnicities
- ex) mean accuracy on different devices
- ex) check transcripts for rude words
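
The rude-word check could be sketched as follows: flag a transcript when it contains a blocklisted word that the reference text does not. The blocklist reuses the two words from the GAN example; the function name and the sample pairs are hypothetical, and a real audit would need a curated word list and better tokenization.

```python
# Placeholder blocklist, reusing the two words from the GAN example.
RUDE_WORDS = {"gang", "gun"}

def rude_mistranscriptions(pairs, blocklist=RUDE_WORDS):
    """Return (reference, hypothesis, words) for hypotheses that introduce
    a blocklisted word not present in the reference transcript."""
    flagged = []
    for reference, hypothesis in pairs:
        ref_words = set(reference.lower().split())
        hyp_words = set(hypothesis.lower().split())
        introduced = (hyp_words & blocklist) - ref_words
        if introduced:
            flagged.append((reference, hypothesis, sorted(introduced)))
    return flagged

# Hypothetical (reference, system output) pairs.
pairs = [
    ("train a gan on images", "train a gang on images"),
    ("the gun was fake", "the gun was fake"),
    ("gan training diverged", "gan training diverged"),
    ("use a gan here", "use a gun here"),
]

flagged = rude_mistranscriptions(pairs)
rate = len(flagged) / len(pairs)
print(f"rude mis-transcription rate: {rate:.0%}")
```

Comparing against the reference keeps correctly transcribed uses of a blocklisted word (the second pair) from being counted as errors.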