What are the steps in the MLOps system, process, and/or lifecycle? The answer depends on who is describing it: startups define it one way, Google and Microsoft another, and every other industry leader in their own way, as shown here:
- DataRobot refers to the process as Augmented Intelligence; it includes prepare, build, deploy, predict, monitor, optimize, and continuous AI.
- Algorithmia describes it as ML development, deployment, and operations.
- H2O explains the steps as prepare, model, explain, operate, develop, and consume.
- Seldon illustrates the lifecycle as 1) Training: data prep, parameterized experiments, selection, trained model; 2) Deployment: managed rollouts, predictions with feedback, inference graphs, request logging, and 3) Monitoring: accuracy, data drift, outliers, and explanations.
- Microsoft defines it visually as 1) creating models with reusable ML pipelines; 2) automating MLOps rollout; 3) automatically creating an audit trail; 4) deploying and monitoring performance, and 5) observing data drift.
- Google defines it as an ML system that has thirteen parts (for now) and each part performs a certain activity. The parts are configuration, automation, data collection, data verification, feature engineering, testing and debugging, ML code, resource management, model analysis, process management, metadata management, serving infrastructure, and monitoring.
- During a presentation, Corey Zumar, a software engineer at Databricks, broke down the ML lifecycle into four steps: 1) raw data; 2) data prep; 3) training, and 4) deployment.
The truth is that the best MLOps solution for a given requirement depends on the use case. Still, the definitions above share common elements: data prep, building, training, deploying, and monitoring.
After reviewing these different points of view, we’ve come up with our own: MLOps is the end-to-end process of building and deploying ML models. It’s simple and contains only two parts: build and deploy.
Of course, each part comprises a number of steps required to complete a given task. One of the most critical parts of any ML project is building a pipeline that pushes tasks forward from one step to the next. Such a pipeline is typically modeled as a DAG, a directed acyclic graph.
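The pipeline-as-DAG idea can be sketched in a few lines of Python. The step names below are illustrative, not taken from any specific tool, and the standard-library `graphlib` module (Python 3.9+) handles the dependency ordering:

```python
from graphlib import TopologicalSorter

# Each key is a pipeline step; each value is the set of steps it depends on.
# These step names are illustrative only.
pipeline = {
    "data_prep": set(),
    "feature_engineering": {"data_prep"},
    "train": {"feature_engineering"},
    "deploy": {"train"},
    "monitor": {"deploy"},
}

def run_pipeline(dag):
    """Execute steps in an order that respects the DAG's dependencies."""
    order = list(TopologicalSorter(dag).static_order())
    for step in order:
        print(f"running step: {step}")
    return order

order = run_pipeline(pipeline)
```

Real orchestrators such as Airflow or Kubeflow Pipelines express the same idea, but add scheduling, retries, and distributed execution on top of the dependency graph.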
To simplify this even further, we can put data prep under build and monitoring under deploy so that we have two primary parts for MLOps.
What are all the steps in the build phase? A few core activities include data preparation, feature engineering, data cleansing, algorithm selection, ML framework selection, model building, and hyperparameter tuning (experimentation). Once again, this list of activities will likely vary from vendor to vendor.
The build phase presents a host of challenges to data scientists and ML engineers. According to an Algorithmia report, “55% of the companies [surveyed] have not deployed an ML model.” However, once these challenges are resolved, more issues await during the deployment phase. The challenges are both process- and product-oriented. When it comes to products, engineers must select the combination of open-source tools that works best for a given requirement.
Fortunately, there has been profound innovation in the development of ML tools, many of which are native to Kubernetes. Kubernetes-native tools are attractive because Kubernetes is the de facto orchestration platform and has won the hearts and minds of the global community. On the other hand, Kubernetes is a beast, and ML engineers must understand the inner workings of a fairly complex platform with hundreds of features and plugins.
In summary, what are the steps in MLOps? It depends.