MLOps (Machine Learning Operations) applies DevOps-style practices to machine learning so that the ML lifecycle (data, training, deployment, monitoring) can be managed reliably and at scale. It addresses the operational challenges that are unique to production ML.
ML lifecycle
An ML project is not just model training; it has a full lifecycle:
1. DATA → collect, clean, label, version data
2. TRAINING → develop, train, evaluate models
3. DEPLOYMENT → put the model into production and serve inference
4. MONITORING → track production performance and issues
5. MAINTENANCE → retrain/update as data and performance change
→ an ongoing cycle, not a one-time effort
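
The five stages and their cyclic nature can be sketched in a few lines of Python. This is only an illustration: the `Stage` enum and the `next_stage` helper are hypothetical names, and having MAINTENANCE lead back to DATA is an assumption about how the loop closes (retraining typically starts from refreshed data).

```python
from enum import Enum


class Stage(Enum):
    """The five lifecycle stages listed above (hypothetical naming)."""
    DATA = 1
    TRAINING = 2
    DEPLOYMENT = 3
    MONITORING = 4
    MAINTENANCE = 5


def next_stage(stage: Stage) -> Stage:
    """Return the stage that follows `stage`.

    Assumption: MAINTENANCE wraps around to DATA, which is what makes
    the lifecycle a cycle rather than a one-time pipeline.
    """
    members = list(Stage)
    return members[(members.index(stage) + 1) % len(members)]


if __name__ == "__main__":
    stage = Stage.DATA
    # Walk one full loop plus one step to show the wrap-around.
    for _ in range(len(Stage) + 1):
        print(stage.name)
        stage = next_stage(stage)
```

Modelling the stages as a closed loop, rather than a list with a terminal state, reflects the point that production ML is never "done".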
What MLOps covers
MLOps → practices and tools for managing the ML lifecycle reliably:
✓ REPRODUCIBILITY → version data, code, and models; track experiments
✓ AUTOMATION → training, testing, and deployment pipelines (CI/CD for ML)
✓ DEPLOYMENT → model serving, scaling, versioning, rollback
✓ MONITORING → track model performance, data drift, and errors; know when to retrain
✓ COLLABORATION → across data scientists, ML engineers, and ops
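
Two of these items, reproducibility and monitoring, can be illustrated with a small standard-library sketch. Everything here is an assumption made for illustration: `run_fingerprint`, `mean_shift`, and `needs_retraining` are hypothetical helpers, the fingerprint scheme is just one possible way to tie a run to its data, code, and parameters, and the mean-shift check is a deliberately crude drift heuristic (real systems often use statistical tests or dedicated monitoring tools). The `0.5` threshold is arbitrary.

```python
import hashlib
import json
import statistics
from typing import Sequence


def run_fingerprint(data_bytes: bytes, code_version: str, params: dict) -> str:
    """Hash the inputs that determine a training run (hypothetical scheme).

    Two runs with the same data, code version, and parameters get the
    same fingerprint, which makes experiments easier to track and compare.
    """
    payload = json.dumps(
        {
            "data_sha256": hashlib.sha256(data_bytes).hexdigest(),
            "code_version": code_version,
            "params": params,
        },
        sort_keys=True,  # stable key order so equal inputs hash equally
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def mean_shift(baseline: Sequence[float], live: Sequence[float]) -> float:
    """Distance between live and baseline means, in baseline standard deviations."""
    spread = statistics.stdev(baseline)  # needs at least two baseline values
    if spread == 0:
        raise ValueError("baseline has zero variance; shift is undefined")
    return abs(statistics.mean(live) - statistics.mean(baseline)) / spread


def needs_retraining(
    baseline: Sequence[float], live: Sequence[float], threshold: float = 0.5
) -> bool:
    """Flag drift when the live mean moves past `threshold` (arbitrary default)."""
    return mean_shift(baseline, live) > threshold


if __name__ == "__main__":
    training_feature = [1.0, 2.0, 3.0, 4.0, 5.0]
    production_feature = [6.0, 7.0, 8.0, 9.0, 10.0]
    print(run_fingerprint(b"rows...", "v1", {"lr": 0.01}))
    print(needs_retraining(training_feature, production_feature))
```

A check like `needs_retraining` is the kind of signal a monitoring job could use to trigger the automated training pipeline, while the fingerprint gives each resulting model a traceable origin.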
