MLflow
MLflow is an open-source machine learning lifecycle platform that manages the end-to-end process of testing and deploying models. It was among the first open-source platforms of its kind, and it rivals Facebook's FBLearner and Uber's Michelangelo.
In the early days, Facebook encountered several challenges in the ML process, from experimentation to testing and deployment. It was a very complex process that required highly skilled engineers and data scientists to conduct experiments. And when the same data was fed into the same models, the results would differ.
FBLearner solved these problems. Today, non-data scientists can conduct thousands of experiments and reproduce the exact same results when the same dataset and tools are used. However, FBLearner is proprietary to Facebook. Databricks developed MLflow as an open-source alternative that eases the process of testing, experimentation, training, and deployment.
Project Background
- Tool: MLflow
- Author: Databricks
- Initial Release: June 2018
- Type: ML lifecycle platform
- License: Apache-2.0
- Supports: Multiple frameworks and tools, including TensorFlow, PyTorch, and XGBoost
- GitHub: mlflow
- Runs On: Cloud and on-prem
- Hardware: GPUs and CPUs
- Twitter: MLflow
Features
- Track and manage experiments
- Reproducible runs
- Light-weight API
- CI/CD workflow integration
- Artifact store
- Versioning
Components
- MLflow Tracking: An API for logging data from experiments, which can then be compared in a UI
- MLflow Projects: A packaging format for reproducing runs using Docker and Conda
- MLflow Models: A packaging format that lets users deploy the same model on different platforms
- MLflow Model Registry: A set of APIs that help manage the full model lifecycle