( reference : Machine Learning Data Lifecycle in Production )

Introduction to Machine Learning Engineering in Production

[1] Overview

  • ML engineering for PRODUCTION
  • Production ML = (1) + (2)
    • (1) ML development
    • (2) software development
  • Challenges in production ML


Traditional ML vs Production ML

Main difference :

  • production ML requires much more than just modeling code!!
  • data is NOT STATIC in production ML!!


[ Traditional ML ]

figure2


[ Production ML ]

figure2


figure2


Manage the entire life cycle of data

  • labeling
    • is it properly labeled?
  • feature space coverage
    • does the training data cover the same feature space as the data seen in production?
  • minimal dimensionality
    • reduce the dimensionality of the features to optimize performance
  • maximum predictive data
    • does the data have predictive information?
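The first two checks above (labeling and feature space coverage) can be sketched in plain Python. This is only an illustration under assumptions: the row layout, the `label` key, and the helper names `check_labels` and `check_feature_coverage` are hypothetical and not part of any library.

```python
def check_labels(rows, allowed_labels):
    """Return indices of rows whose label is missing or not in the allowed set."""
    return [i for i, row in enumerate(rows) if row.get("label") not in allowed_labels]


def check_feature_coverage(train_rows, serving_rows, feature):
    """Return serving values of `feature` that never appeared in training."""
    seen_in_training = {row[feature] for row in train_rows}
    return {row[feature] for row in serving_rows} - seen_in_training


# Hypothetical toy data: one training row is unlabeled,
# and one serving row has a color the training data never covered.
train = [
    {"color": "red", "size": 1.0, "label": "cat"},
    {"color": "blue", "size": 2.0, "label": "dog"},
    {"color": "red", "size": 1.5, "label": None},
]
serving = [
    {"color": "green", "size": 1.2},
    {"color": "red", "size": 0.9},
]

bad_label_rows = check_labels(train, allowed_labels={"cat", "dog"})
unseen_colors = check_feature_coverage(train, serving, "color")
print(bad_label_rows, unseen_colors)
```

Dimensionality and predictive information need statistical tools (feature selection, correlation with the label, and so on), so they are left out of this sketch.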


Production ML system

figure2

figure2

  • continuously monitor model performance, ingest new data, retrain when needed, and redeploy to maintain / improve performance
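The monitor / retrain / redeploy loop can be sketched as follows. It is a toy under stated assumptions: the "model" is a single threshold on one number, and `train_threshold_model`, `monitor_and_maybe_retrain`, and the `min_accuracy` cutoff are hypothetical names chosen for illustration.

```python
def accuracy(model, examples):
    """Fraction of (x, y) examples the model classifies correctly."""
    correct = sum(1 for x, y in examples if model(x) == y)
    return correct / len(examples)


def train_threshold_model(examples):
    """Toy 'training': put the threshold midway between the two class means."""
    pos = [x for x, y in examples if y == 1]
    neg = [x for x, y in examples if y == 0]
    threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x: 1 if x >= threshold else 0


def monitor_and_maybe_retrain(model, new_batch, min_accuracy=0.8):
    """Score the deployed model on fresh labeled data; retrain if it degraded."""
    score = accuracy(model, new_batch)
    if score >= min_accuracy:
        return model, score, False
    return train_threshold_model(new_batch), score, True


initial_data = [(1.0, 0), (2.0, 0), (8.0, 1), (9.0, 1)]
model = train_threshold_model(initial_data)

# The data has shifted upward, so the old threshold no longer separates the classes.
drifted_batch = [(6.0, 0), (7.0, 0), (12.0, 1), (13.0, 1)]
model, score_before, retrained = monitor_and_maybe_retrain(model, drifted_batch)
print(score_before, retrained, accuracy(model, drifted_batch))
```

In a real system the fresh labels usually arrive late or have to be collected separately, which is one reason the data lifecycle matters as much as the modeling code.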


Challenges in production grade ML

  • have to build an INTEGRATED ML system
  • need to CONTINUOUSLY operate it in production
  • handle CONTINUOUSLY CHANGING DATA
  • optimize compute resource costs


[2] ML Pipelines

Outline

  • ML Pipelines
  • DAG (Directed Acyclic Graphs) & Pipeline Orchestration Frameworks
  • TFX ( TensorFlow Extended )


ML pipeline

figure2


DAG ( Directed Acyclic Graphs )

  • directed graphs with NO cycles
  • ML pipeline workflows : usually DAGs
    • define the sequencing of tasks
    • tasks have relationships/dependencies with each other

figure2
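A small sketch of a pipeline expressed as a DAG, using the standard-library `graphlib` module (Python 3.9+). The task names are hypothetical and chosen only to illustrate the idea; `is_dag` is a helper defined here, not a library function.

```python
from graphlib import CycleError, TopologicalSorter

# Each key maps a task to the set of tasks it depends on (hypothetical names).
pipeline = {
    "validate": {"ingest"},
    "transform": {"validate"},
    "train": {"transform"},
    "evaluate": {"train"},
    "deploy": {"evaluate"},
}

# A topological order lists every task after all of its dependencies.
order = list(TopologicalSorter(pipeline).static_order())
print(order)


def is_dag(graph):
    """Return True if the dependency graph has no cycles."""
    try:
        TopologicalSorter(graph).prepare()
        return True
    except CycleError:
        return False


print(is_dag(pipeline))                    # no cycles
print(is_dag({"a": {"b"}, "b": {"a"}}))    # a <-> b is a cycle
```

The "acyclic" part is what makes a valid execution order exist: if two tasks depended on each other, neither could run first.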


Pipeline Orchestration Frameworks

  • GOAL : schedule components in ML pipelines
  • automate pipeline execution
  • ex) Airflow, Argo, Celery, Luigi, Kubeflow

figure2
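To make the scheduling goal concrete, here is a deliberately tiny orchestrator that runs components in dependency order and hands each one the outputs of its upstream components. It is not how Airflow, Kubeflow, or the other frameworks are implemented; `run_pipeline` and the component names are hypothetical, and real frameworks add things like scheduling, retries, and distributed execution on top.

```python
from graphlib import TopologicalSorter


def run_pipeline(components, dependencies):
    """Run each component after all of its upstream components have finished.

    components:   name -> function taking a dict of upstream outputs
    dependencies: name -> set of upstream component names
    """
    artifacts = {}
    for name in TopologicalSorter(dependencies).static_order():
        upstream = {dep: artifacts[dep] for dep in dependencies.get(name, ())}
        artifacts[name] = components[name](upstream)
    return artifacts


# Hypothetical three-step pipeline; each step only sees its declared inputs.
components = {
    "ingest": lambda up: [3, 1, 2],
    "transform": lambda up: sorted(up["ingest"]),
    "train": lambda up: sum(up["transform"]) / len(up["transform"]),
}
dependencies = {
    "ingest": set(),
    "transform": {"ingest"},
    "train": {"transform"},
}

artifacts = run_pipeline(components, dependencies)
print(artifacts)
```

Declaring dependencies separately from the component code is the design choice worth noticing: the runner decides the order, so components never call each other directly.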


TFX ( TensorFlow Extended )

  • end-to-end platform for deploying production ML pipelines

figure2

