Tracking Machine Learning Experiments with MLflow

Organizations adopting MLOps management software for the first time may find the task challenging. With countless open-source tools available, selecting the best one can be overwhelming. To help address this issue, OSSInsight.io created ranking lists based on metrics such as stars, pull requests, and issues. By number of stars, the five most popular MLOps tools for 2023 are Jina, MLflow, NNI, Kubeflow, and Label Studio.

In this article, we will review the Machine Learning (ML) lifecycle and how MLflow fits into it. For further information on other MLOps tools, please refer to our blog.

Machine Learning Lifecycle

Let’s start with the basics. Machine Learning is a subfield of Artificial Intelligence (AI) that involves building algorithms or models that learn from large datasets to make predictions without being explicitly programmed. The simplest categorization designates ML models that employ regression and classification techniques as Supervised Learning, and those that use clustering and dimensionality reduction techniques as Unsupervised Learning.

Machine Learning modeling consists of two main stages: Training and Prediction. During the training stage, an algorithm or code uses input and output data relevant to the developer’s goals to build a predictive model. Once built, the model’s performance is evaluated to determine its accuracy through predictions with known data.
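
The two stages can be illustrated with a minimal sketch in plain Python. The toy data, the least-squares line fit, and the function names are assumptions made for illustration; they do not come from MLflow or any other library.

```python
from statistics import mean


def train(xs, ys):
    """Training stage: fit y = slope * x + intercept by least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(model, x):
    """Prediction stage: apply the fitted line to a new input."""
    slope, intercept = model
    return slope * x + intercept


def mean_absolute_error(model, xs, ys):
    """Evaluation: compare predictions against known outputs."""
    return mean(abs(predict(model, x) - y) for x, y in zip(xs, ys))


# Toy data invented for this example; it follows y = 2x + 1 exactly.
train_x, train_y = [1, 2, 3, 4], [3, 5, 7, 9]
test_x, test_y = [5, 6], [11, 13]

model = train(train_x, train_y)
error = mean_absolute_error(model, test_x, test_y)
print(f"slope={model[0]:.2f}, intercept={model[1]:.2f}, test MAE={error:.2f}")
```

On this toy data the fit recovers the line exactly, so the error on the held-out points is zero. Real datasets are noisy, which is why the evaluation step matters.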

However, the brief explanation above assumes a single “iteration” and a model that already complies with the technical specifications. In a developer’s typical daily work, ML modeling involves intricate tasks and repeated trial and error. Many models and datasets must be tested to get the best performance from the model, which makes the process challenging and often leads users to track parameters and metrics in spreadsheets.

As you can see, developing an ML project is a complex process. It covers various stages, referred to as the Machine Learning Lifecycle, including problem definition, data collection and manipulation, engineering analysis, training, evaluation, deployment, A/B testing, and monitoring. This is why developers use MLOps tools like MLflow to manage an ML project.

Introducing MLflow

MLflow is an open-source platform built to manage the entire Machine Learning Lifecycle. It was designed to simplify experimentation, reproducibility, deployment, and model management through its four modules:

MLflow Tracking: Track the results of an ML model by logging parameters and metrics. It also provides a UI for easy visualization of those results.
MLflow Projects: Package ML code in a reusable format and ensure reproducibility of runs across platforms.
MLflow Models: Support a standard format for packaging and deploying models.
MLflow Model Registry: Provide a centralized model store for managing the model lifecycle.

MLflow is designed to work with any machine learning library and can be integrated into a wide range of environments, whether on-premises or in the cloud. It can be used from any programming language through its REST API and CLI, and for ease of use it also offers Python, R, and Java APIs.

Why Should You Use It?

MLflow eliminates the need to track information manually across multiple spreadsheets during the model development phase. With just one line of code, mlflow.tensorflow.autolog(), MLflow can automatically log data. Other functions, such as mlflow.log_param() and mlflow.log_metrics(), let users choose the specific parameters and metrics to record.
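
As a rough sketch of manual logging, the snippet below records two parameters and a per-epoch metric inside a single run. The training function is a placeholder that returns an invented loss, and the names run_experiment and train_one_epoch are hypothetical; only mlflow.start_run(), mlflow.log_param(), and mlflow.log_metric() are MLflow calls.

```python
import math


def train_one_epoch(learning_rate: float, epoch: int) -> float:
    """Placeholder for a real training step: returns a made-up, decreasing loss."""
    return math.exp(-learning_rate * (epoch + 1))


def run_experiment(learning_rate: float, epochs: int) -> float:
    """Run the placeholder training loop and log it as one MLflow run."""
    import mlflow  # imported here so the placeholder above works without MLflow

    with mlflow.start_run():
        # Parameters are logged once per run.
        mlflow.log_param("learning_rate", learning_rate)
        mlflow.log_param("epochs", epochs)

        loss = float("nan")
        for epoch in range(epochs):
            loss = train_one_epoch(learning_rate, epoch)
            # Metrics can be logged repeatedly; `step` orders the values.
            mlflow.log_metric("loss", loss, step=epoch)
        return loss
```

Calling run_experiment(learning_rate=0.1, epochs=5) several times with different learning rates should produce one row per call in the MLflow UI, which replaces the spreadsheet described above.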

The MLflow UI allows developers to visualize logged values such as learning rate, number of epochs, validation accuracy, and more. As you conduct multiple runs of an ML model, MLflow adds rows to the UI so you can analyze and compare results. The UI can filter runs on specific criteria (e.g., metrics.log_loss > 0.08) and sort experiments by a particular parameter or metric.
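
The same kind of filtering and sorting can also be done from code. The following is a minimal sketch, assuming runs that logged a metric named log_loss as in the example above; the helper names metric_filter and find_runs are hypothetical, while mlflow.search_runs() with its filter_string and order_by arguments belongs to the MLflow Python API.

```python
def metric_filter(metric: str, op: str, threshold: float) -> str:
    """Build a filter expression such as 'metrics.log_loss > 0.08'."""
    if op not in {">", ">=", "<", "<=", "=", "!="}:
        raise ValueError(f"unsupported comparison operator: {op}")
    return f"metrics.{metric} {op} {threshold}"


def find_runs(metric: str, op: str, threshold: float):
    """Return matching runs from the active experiment, sorted by the metric."""
    import mlflow  # imported lazily; requires `pip install mlflow`

    # By default search_runs returns a pandas DataFrame with one row per run.
    return mlflow.search_runs(
        filter_string=metric_filter(metric, op, threshold),
        order_by=[f"metrics.{metric} ASC"],
    )
```

For example, find_runs("log_loss", ">", 0.08) would list the runs whose log_loss exceeds 0.08, lowest value first.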

Since its initial release, MLflow has integrated with various platforms, including Conda, Google Cloud, Kubernetes, Docker, TensorFlow, PyTorch, and others, demonstrating its stability and ability to scale as the ML project matures.

Quick Installation Guide

This section shows how to install MLflow and import it into your code. The tutorial uses a Jupyter Notebook; if you use a different environment, refer to the official documentation. You can install MLflow using the following command.

!pip install mlflow

To check the installed version, run:

!mlflow --version

Once you have installed MLflow, the next step is to launch the UI. To access the dashboard and compare experiments, run “mlflow ui” in the command prompt, which starts the server at a URL such as “http://127.0.0.1:5000/”. Opening that URL in a browser displays information such as Start Time, User, Source, Parameters, Metrics, and more. To view additional details, click on the experiment’s name.

!mlflow ui

Now the goal is to run experiments, in the form of model training, and track the runs in the MLflow UI.

What Should We Expect in the Coming Years?

As more businesses adopt AI technologies, we can expect significant advancements and a simultaneous increase in data usage. Datasets will become more complex and will draw on different sources and data types.

From a workflow automation perspective, developers will continue to build new tools to streamline machine learning processes. Meanwhile, tools such as MLflow will continue to evolve and become more user-friendly, making them accessible to a broader range of users.

According to an article published on InsideBIGDATA, Moses Guttmann, CEO and Co-Founder of ClearML, believes that “MLOps is here to stay and finding MLOps success depends on more than just grabbing the newest, shiniest solution off the shelf. By keeping these few best practices in mind, businesses can build the foundations they need for sustainable MLOps growth.”

Overall, the coming years will see continued growth and development in data processing and machine learning, focusing on improving AI systems’ reliability, transparency, and ethics.