Azure MLOps is a game-changer for businesses looking to streamline their machine learning operations. With Azure MLOps, you can automate the deployment and management of your machine learning models.
By using Azure Machine Learning, you can create, manage, and deploy models at scale. This includes automating tasks such as data preparation, model training, and model deployment.
Azure MLOps also provides a centralized platform for tracking and monitoring your models, allowing you to quickly identify issues and optimize performance. This helps to reduce the time and resources required to manage your machine learning operations.
By automating the deployment and management of your machine learning models, you can focus on higher-level tasks such as strategy and innovation.
Prerequisites
Before we dive into Azure MLOps, let's make sure you have the necessary prerequisites.
You'll need an Azure subscription, which you can create for free if you don't already have one. A free account gives you access to both the free and paid tiers of Azure Machine Learning.
To work with Azure Machine Learning, you'll also need an Azure Machine Learning workspace. This will serve as the central hub for your machine learning projects.
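Once the workspace exists, you'll typically connect to it from Python. Here's a minimal sketch using the Azure ML SDK v2 (the azure-ai-ml package); the subscription ID, resource group, and workspace name are placeholders you'd replace with your own values.

```python
# pip install azure-ai-ml azure-identity
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

# Placeholder values: substitute your own subscription, resource group,
# and workspace name.
ml_client = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id="<your-subscription-id>",
    resource_group_name="<your-resource-group>",
    workspace_name="<your-workspace-name>",
)

# Quick sanity check: fetch the workspace and print its region.
workspace = ml_client.workspaces.get(ml_client.workspace_name)
print(workspace.location)
```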
In addition to Azure Machine Learning, you'll need Git running on your local machine. Specifically, you'll need Git version 2.27 or newer, which you can download from the official Git website.
If you're planning to use Azure DevOps, you'll need to set up an organization and create a project to host your source repositories and pipelines. This will help you manage your code and automate your workflows.
Finally, if you're using Azure DevOps with Terraform to spin up infrastructure, you'll need to install the Terraform extension.
Setup and Configuration
To set up Azure DevOps, you'll need to follow these steps. Navigate to Azure DevOps and select Create a new project, naming it mlopsv2 for this tutorial.
You'll need to create a service connection to Azure Resource Manager. In the project under Project Settings, select Service Connections and then Create Service Connection.
To create the connection, select Azure Resource Manager, then Next, and choose Service principal (manual). Select Next again and choose Subscription as the scope level.
Name the service connection Azure-ARM-Prod and select Grant access permission to all pipelines, then Verify and Save.
Once you've completed these steps, you can set up a source repository with Azure DevOps. Open the project you created and select the Repos section.
To import the repository, enter https://github.com/Azure/mlops-v2-ado-demo in the Clone URL field and select Import at the bottom of the page.
Next, open the Project settings and select Repositories. Select the repository you created and select the Security tab.
Under the User permissions section, select the mlopsv2 Build Service user and set both the Contribute permission and the Create branch permission to Allow.
Now, open the Pipelines section and select Manage Security. Select the mlopsv2 Build Service account for your project under the Users section and change the permission Edit build pipeline to Allow.
Here are the steps to set up Azure DevOps in a concise list:
- Navigate to Azure DevOps and select Create a new project.
- Create a service connection to Azure Resource Manager.
- Import the repository with the Clone URL field.
- Set up user permissions for the mlopsv2 Build Service user.
- Manage security for the mlopsv2 Build Service account.
Infrastructure and Deployment
You can deploy infrastructure via Azure DevOps by running the Azure infrastructure pipeline, which provisions the resources your MLOps project will use.
To run the pipeline, you first update the config-infra-prod.yml file with your preferred namespace, postfix, and location.
The pipeline then creates the infrastructure for your MLOps project, including the Azure Machine Learning workspace and its supporting resources.
Here are the steps to deploy the infrastructure:
1. Go to your repository, mlops-v2-ado-demo, and select the config-infra-prod.yml file.
2. Update the namespace, postfix, and location to your liking.
3. Select Commit and push code to get these values into the pipeline.
4. Go to the Pipelines section and select Create Pipeline.
5. Select Azure Repos Git and select the repository that you imported in the previous section.
6. Select Existing Azure Pipelines YAML file and select the main branch and choose mlops/devops-pipelines/cli-ado-deploy-infra.yml.
7. Run the pipeline, which will take a few minutes to finish.
Once the pipeline is complete, you'll have the infrastructure for your MLOps project deployed.
Deploying Infrastructure
Deploying infrastructure is a crucial step in setting up your MLOps project: it provisions the Azure Machine Learning workspace to which you'll later deploy the training pipeline.
Make sure you understand the Architectural Patterns of the solution accelerator before proceeding. This will ensure a smooth deployment process.
To deploy the infrastructure, you'll need to go to your repository, select the config-infra-prod.yml file, and update the namespace, postfix, and location values to your liking. The namespace should be 5 random letters, the postfix should be 4 random digits, and the location should be eastus.
If you're running a Deep Learning workload, ensure your GPU compute is available in your deployment zone. This is important for tasks like CV or NLP.
To create the pipeline, go to the Pipelines section, select Create Pipeline, and choose Azure Repos Git as the source. Then, select the repository you cloned earlier and choose the existing Azure Pipelines YAML file.
The pipeline creates the infrastructure for your MLOps project. You may see warnings about moving and reusing existing repositories; these can be safely ignored.
Here's a summary of the steps:
- Update the config-infra-prod.yml file with your preferred values.
- Commit and push the code to get these values into the pipeline.
- Go to the Pipelines section and select Create Pipeline.
- Choose Azure Repos Git as the source and select the repository you cloned earlier.
- Choose the existing Azure Pipelines YAML file and select Continue.
- Run the pipeline and wait for it to finish.
Making the Model Available: Deployment
To make a trained model available for use, you can leverage the existing endpoints functionality in Azure ML, which lets you create an endpoint and deploy one or more models to it.
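As an illustration, here's a minimal sketch of what creating an endpoint and deploying a model to it looks like with the Azure ML SDK v2. The tutorial itself drives this through an Azure DevOps pipeline, and the endpoint, model, and workspace names below are placeholders.

```python
from azure.ai.ml import MLClient
from azure.ai.ml.entities import ManagedOnlineEndpoint, ManagedOnlineDeployment
from azure.identity import DefaultAzureCredential

ml_client = MLClient(DefaultAzureCredential(), "<subscription-id>",
                     "<resource-group>", "<workspace-name>")

# Create a managed online endpoint (name is a placeholder).
endpoint = ManagedOnlineEndpoint(name="taxi-fare-endpoint", auth_mode="key")
ml_client.online_endpoints.begin_create_or_update(endpoint).result()

# Deploy a registered model version to the endpoint. This assumes an
# MLflow-format model; a custom model would also need a scoring script
# and environment.
deployment = ManagedOnlineDeployment(
    name="blue",
    endpoint_name="taxi-fare-endpoint",
    model="azureml:taxi-model:1",  # placeholder registered-model ID
    instance_type="Standard_DS3_v2",
    instance_count=1,
)
ml_client.online_deployments.begin_create_or_update(deployment).result()
```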
You can use the Azure DevOps pipeline to deploy the trained model to the Azure Machine Learning workspace.
In Azure DevOps, you can create a new pipeline and select the mlopsv2 repository that you imported in the previous section.
You can then select the existing Azure Pipelines YAML file for model deployment, deploy-online-endpoint-pipeline.yml, to deploy the trained model to an endpoint.
To test the deployment, you can go to the Endpoints tab in your AzureML workspace, select the endpoint, and click the Test Tab.
You can use the sample input data located in the cloned repo at /data/taxi-request.json to test the endpoint.
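Besides the Test tab in the studio, you can also invoke the endpoint programmatically. A sketch, assuming the endpoint name your deployment created:

```python
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

ml_client = MLClient(DefaultAzureCredential(), "<subscription-id>",
                     "<resource-group>", "<workspace-name>")

# Send the sample payload from the cloned repo to the endpoint.
response = ml_client.online_endpoints.invoke(
    endpoint_name="<your-endpoint-name>",  # placeholder
    request_file="data/taxi-request.json",
)
print(response)
```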
The inference logic is encapsulated in a Python file, usually called the “scoring script”, which serves as the link between the new data and the trained model.
The scoring script pre-processes the new, incoming data before it’s fed to the model.
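A scoring script for Azure ML follows a fixed contract: an init() function that runs once when the container starts and loads the model, and a run() function that is called for every request. Here's a minimal sketch; the model file name and payload shape are illustrative.

```python
# score.py: minimal scoring script sketch
import json
import os

import joblib  # assumes the model was serialized with joblib/pickle

def init():
    """Runs once at container startup: load the model into memory."""
    global model
    # AZUREML_MODEL_DIR points at the registered model's files.
    model_path = os.path.join(os.environ["AZUREML_MODEL_DIR"], "model.pkl")
    model = joblib.load(model_path)

def run(raw_data):
    """Runs per request: pre-process the payload, then predict."""
    data = json.loads(raw_data)["data"]  # payload shape is illustrative
    predictions = model.predict(data)
    return predictions.tolist()
```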
Here are the steps to deploy the trained model:
- Create a new pipeline in Azure DevOps and select the repository that you cloned from the previous section mlopsv2.
- Select the existing Azure Pipelines YAML file, deploy-online-endpoint-pipeline.yml, to deploy the trained model.
- Go to the Endpoints tab in your AzureML workspace, select the endpoint, and click the Test Tab to test the deployment.
Training and Deployment
Training and deployment are crucial steps in the Azure MLOps process. You can use the solution accelerator to create a sample end-to-end machine learning pipeline that runs a linear regression to predict taxi fares in NYC.
The pipeline includes several components, each with its own function, which can be registered with the workspace, versioned, and reused with various inputs and outputs. This pipeline contains four main steps: data preparation, model training, model evaluation, and model registration.
Here's a breakdown of the pipeline's steps; a sketch of how they might be wired together follows the list:
- Data preparation: This step takes multiple taxi datasets (yellow and green) and merges/filters the data, preparing the train/val and evaluation datasets.
- Model training: This step trains a Linear Regressor with the training set.
- Model evaluation: This step uses the trained model to predict taxi fares on the test set and compares its performance with all previously deployed models on the new test dataset.
- Model registration: This step scores the model based on how accurate the predictions are in the test set and registers the model in Azure Machine Learning.
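The solution accelerator defines this pipeline in YAML, but the same four steps can be sketched with the Azure ML SDK v2 pipeline DSL. In the sketch below, the component file paths and their input/output names are hypothetical.

```python
from azure.ai.ml import Input, load_component
from azure.ai.ml.dsl import pipeline

# Hypothetical component specs; the accelerator ships its own definitions.
prep = load_component(source="components/prep.yml")
train = load_component(source="components/train.yml")
evaluate = load_component(source="components/evaluate.yml")
register = load_component(source="components/register.yml")

@pipeline(description="NYC taxi fares: prep, train, evaluate, register")
def taxi_fare_pipeline(raw_data: Input):
    # Each call creates a step; one step's outputs feed the next one's inputs.
    prep_step = prep(raw_data=raw_data)
    train_step = train(train_data=prep_step.outputs.train_data)
    eval_step = evaluate(
        model=train_step.outputs.model,
        test_data=prep_step.outputs.test_data,
    )
    register(model=train_step.outputs.model,
             deploy_flag=eval_step.outputs.deploy_flag)
```

Submitting the pipeline would then be a single call such as ml_client.jobs.create_or_update(taxi_fare_pipeline(raw_data=Input(type="uri_folder", path="./data"))), reusing an MLClient like the one shown earlier.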
Once the pipeline is created, you can deploy it to Azure Machine Learning using Azure DevOps pipelines. This involves creating a new pipeline, selecting the repository, and choosing the YAML file for the deployment. After saving and running the pipeline, the infrastructure is configured, and the Prototyping Loop of the MLOps Architecture is deployed.
Training and Deployment Scenario
The training and deployment scenario is a crucial part of the machine learning pipeline.
A sample training pipeline for predicting taxi fares in NYC includes components for data preparation, model training, and model evaluation. This pipeline is made up of multiple steps, each serving a different function.
Data preparation involves merging and filtering taxi datasets, as well as preparing train, validation, and evaluation datasets. The input for this step is local data under ./data/ (multiple .csv files), and the output is a single prepared dataset (.csv) and train/val/test datasets.
Model training involves training a Linear Regressor with the training set. The input for this step is the training dataset, and the output is a trained model (pickle format).
Model evaluation involves using the trained model to predict taxi fares on the test set. The input for this step is the ML model and the test dataset, and the output is the model's performance and a flag indicating whether or not to deploy.
Here are the steps involved in deploying a model training pipeline:
1. Go to ADO pipelines
2. Select New Pipeline
3. Select Azure Repos Git
4. Select the mlopsv2 repository that you imported in the previous section
5. Select Existing Azure Pipelines YAML file
6. Select main as a branch and choose /mlops/devops-pipelines/deploy-model-training-pipeline.yml, then select Continue
7. Save and Run the pipeline
Once the pipeline is deployed, the infrastructure is configured, and the Prototyping Loop of the MLOps Architecture is deployed.
Automated Testing
Automated testing is a crucial step in ensuring the quality of your machine learning models. It involves creating test datasets that represent real-world scenarios, covering a range of input variations and edge cases.
To create effective test datasets, you should consider including a variety of input variations and edge cases. This will help you comprehensively validate model behavior and catch any potential issues early on.
Evaluation metrics, such as accuracy, precision, recall, and F1-score, should be defined to indicate the desired performance of your model. This will help you understand how well your model is performing and identify areas for improvement.
Automated testing involves running the entire model pipeline using test data and verifying the final predictions. This ensures that all components interact correctly and produce the expected outcomes.
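As a concrete illustration, a CI test can load the trained model, score a held-out test set, and assert a minimum quality bar. The pytest-style sketch below assumes the artifact paths, label column, and error threshold; it is not the accelerator's actual test suite.

```python
# test_model_pipeline.py: pytest-style sketch with assumed paths/thresholds
import joblib
import pandas as pd
from sklearn.metrics import mean_absolute_error

def test_model_meets_quality_bar():
    model = joblib.load("outputs/model.pkl")      # assumed artifact path
    test_df = pd.read_csv("data/test.csv")        # assumed test dataset
    X = test_df.drop(columns=["fare_amount"])     # assumed label column
    y = test_df["fare_amount"]

    predictions = model.predict(X)

    # Fail the pipeline if the error exceeds an agreed (illustrative) threshold.
    assert mean_absolute_error(y, predictions) < 2.0
```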
By including automated testing steps within your CI/CD pipeline, you can ensure that new model versions are thoroughly tested before deployment. This helps prevent faulty models from being deployed and maintains the quality of deployed solutions.
Automated testing identifies bugs and discrepancies in model predictions, preventing faulty models from being deployed. It also enhances model transparency by validating predictions against known test cases.
Models that pass automated tests are more likely to perform well in production, leading to robust deployments and reduced risk of operational disruptions.
Operations and Management
In Azure MLOps, Operations and Management is a crucial aspect that ensures the smooth functioning of machine learning models in production. Azure Application Insights is a monitoring service used for detecting performance anomalies.
Azure Machine Learning provides a suite of tools for automating the end-to-end ML lifecycle, including creating reproducible ML pipelines, registering, packaging, and deploying models from anywhere. This allows for faster experimentation, development, and deployment of Azure machine learning models into production and quality assurance.
Here are some key MLOps capabilities provided by Azure Machine Learning; a model-registration sketch follows the list:
- Create reproducible ML pipelines
- Create reusable software environments
- Register, package, and deploy models from anywhere
- Capture the governance data for the end-to-end ML lifecycle
- Notify and alert on events in the ML lifecycle
- Monitor ML applications for operational and ML-related issues
- Automate the end-to-end ML lifecycle with Azure Machine Learning and Azure Pipelines
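For example, registering a trained model so it can be versioned and deployed from anywhere might look like this SDK v2 sketch; the model name, path, and description are placeholders.

```python
from azure.ai.ml import MLClient
from azure.ai.ml.constants import AssetTypes
from azure.ai.ml.entities import Model
from azure.identity import DefaultAzureCredential

ml_client = MLClient(DefaultAzureCredential(), "<subscription-id>",
                     "<resource-group>", "<workspace-name>")

# Register a trained model artifact (name and path are placeholders).
model = Model(
    path="outputs/model.pkl",
    name="taxi-fare-model",
    type=AssetTypes.CUSTOM_MODEL,
    description="Linear regressor for NYC taxi fare prediction",
)
registered = ml_client.models.create_or_update(model)
print(registered.name, registered.version)
```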
Operations
Operations is a critical aspect of MLOps on Azure. It's the process of managing the machine learning lifecycle, from development to deployment.
With Azure Machine Learning, you can automate the end-to-end ML lifecycle using Azure Machine Learning and Azure Pipelines. This helps streamline the process and reduces the risk of errors.
Monitoring is also a key part of operations. Azure Application Insights is a monitoring service that helps detect performance anomalies. It's like having a watchful eye on your machine learning applications.
Here are some ways Azure helps with operations:
- Capture governance data for the end-to-end ML lifecycle
- Notify and alert on events in the ML lifecycle
- Monitor ML applications for operational and ML-related issues
By using Azure Machine Learning, you can create reproducible ML pipelines and reusable software environments. This makes it easier to manage and maintain your machine learning models.
Data Drift Detection
Data drift detection is a crucial aspect of ensuring your model remains valid over time. This involves checking for changes in the data distribution between the training dataset and new data.
A periodic check for data drift can be performed using Azure ML's Dataset Monitors tool. However, for our implementation, we sidestepped updating the datasets by creating a new class instance each time we want to check for data drift.
The execution frequency of the data drift detection script can be customized, depending on the needs of the scenario. We opted for daily execution, as we expect new data to arrive on a daily basis.
The script retrieves the new data, updates the datasets for comparison, sends the data drift detection job to the compute, and gathers the metrics. Every time a data drift detection job is executed, we end up with metrics, graphs, and if data drift is detected, an email alert.
The main metric aims to summarize the extent of difference across datasets, regardless of the volume of columns or observations. This metric is easy to grasp at a glance, but the calculation method is not well-documented.
The other metrics include simple statistics for each feature, such as the minimum, maximum, mean, Euclidean distance, and the Wasserstein metric. These metrics can be used to retrain the model automatically if necessary, using the new data obtained after model deployment.
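As an illustration of those per-feature statistics, the sketch below computes the Wasserstein distance between the baseline and new data for every numeric column using SciPy. The threshold, file paths, and column handling are assumptions, not the article's exact script.

```python
import pandas as pd
from scipy.stats import wasserstein_distance

DRIFT_THRESHOLD = 0.1  # illustrative; tune per feature and scenario

def detect_drift(baseline: pd.DataFrame, new_data: pd.DataFrame) -> dict:
    """Compare each numeric feature's distribution between two datasets."""
    drifted = {}
    for column in baseline.select_dtypes("number").columns:
        distance = wasserstein_distance(baseline[column], new_data[column])
        if distance > DRIFT_THRESHOLD:
            drifted[column] = distance
    return drifted

baseline = pd.read_csv("data/training_snapshot.csv")  # assumed paths
new_data = pd.read_csv("data/latest_batch.csv")
print(detect_drift(baseline, new_data))  # a non-empty result would trigger an alert
```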
Retraining the model can only be done if we have access to the labels corresponding to the observations, the y-values. In our flash offer scenario, we would know at the end of the day whether the customers have accepted the offer or not, so we could obtain the labels.
Data drift detection is key in understanding how data distributions shift and identifying any issues warranting attention. While it's sufficient for numerous applications, there may be cases that need more advanced techniques such as data slicing or concept drift analysis.
Architecture and Design
In an Azure MLOps workflow, the architecture and design play a crucial role in ensuring seamless collaboration between various roles and the integration of different pipelines.
The end-to-end MLOps workflow involves several roles, including Data Engineers, Data Scientists, QA Engineers, and MLOps Engineers, each with distinct responsibilities. The workflow is designed to ensure clean and structured data is available for machine learning processes, with automated processes that train models on the latest data and validate them effectively.
The collaboration between these roles and the automation of processes are key to delivering reliable machine learning solutions. Here's a breakdown of the key roles involved in the workflow:
- Data Engineer: Responsible for managing the data pipeline and handling changes to datasets.
- Data Scientist: Focuses on data preprocessing, model training, and producing a champion model ready for deployment.
- QA Engineer: Conducts rigorous testing and validation of the model to ensure it meets business and technical requirements.
- MLOps Engineer: Oversees the entire lifecycle, ensuring automation, monitoring, and retraining workflows are in place for continuous improvements.
Python Architecture
The Python Architecture is built on top of Azure Pipelines, which break down pipelines into logical steps called tasks. These tasks are crucial for the build and release pipelines.
Azure Machine Learning is the heart of this architecture, using the Azure Machine Learning SDK for Python to create a workspace for experiments, compute resources, and more. This SDK is a powerful tool for machine learning engineers.
Azure Machine Learning Compute is a cluster of virtual machines where training jobs are executed. This is where the heavy lifting happens, and models are trained on large datasets.
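Provisioning such a cluster from Python might look like the following SDK v2 sketch; the cluster name and VM size are placeholders.

```python
from azure.ai.ml import MLClient
from azure.ai.ml.entities import AmlCompute
from azure.identity import DefaultAzureCredential

ml_client = MLClient(DefaultAzureCredential(), "<subscription-id>",
                     "<resource-group>", "<workspace-name>")

# Autoscaling CPU cluster; scales to zero when idle to save cost.
cluster = AmlCompute(
    name="cpu-cluster",       # placeholder name
    size="STANDARD_DS3_V2",   # placeholder VM size
    min_instances=0,
    max_instances=4,
)
ml_client.compute.begin_create_or_update(cluster).result()
```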
Azure Machine Learning Pipelines provide reusable machine learning workflows. These workflows are published or updated at the end of the build phase and triggered on new data arrival.
In the Python Architecture, Azure Blob Storage is used to store logs from the scoring service. This is where input data and model predictions are collected.
The scoring Python script is packaged as a Docker image and versioned in Azure Container Registry. This registry is essential for tracking different versions of the script.
Azure Container Instances provide an easy, serverless way to run a container, mimicking the QA and staging environment as part of the release pipeline, so developers can test their applications quickly.
Python Basics
Python is an object-oriented language that's easy to learn and use. It's great for beginners and experienced developers alike.
Indentation is crucial in Python, as it's used to define block-level structure. This means that indentation is not just a styling choice, but a necessary part of writing valid Python code.
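For instance, a function or loop body is delimited purely by its indentation:

```python
def describe(n):
    # These indented lines form the function body; dedenting ends the block.
    if n % 2 == 0:
        return "even"
    return "odd"

print(describe(4))  # prints "even"
```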
Python has a vast collection of libraries and frameworks that make it ideal for web development, data analysis, and more. The popular library NumPy is used for efficient numerical computation.
Python's syntax is designed to be simple and readable, with a focus on whitespace to separate code blocks. This makes it easy to write and understand Python code.
Python is often used in data science and machine learning applications due to its ease of use and extensive libraries, including scikit-learn and TensorFlow.
The End-to-End MLOps Workflow
In an end-to-end MLOps workflow, collaboration between various roles and automated pipelines is crucial: Data Engineers, Data Scientists, QA Engineers, MLOps Engineers, Software Engineers, and Infrastructure Engineers all work alongside the training and deploy pipelines.
Each role has a specific responsibility, such as managing data pipelines, training models, conducting testing and validation, deploying models, and overseeing the entire lifecycle of machine learning models.
Data quality issues, infrastructure scalability, high implementation costs, lack of skilled talent, and aligning solutions with business objectives are common challenges involved in integrating AI and ML into company systems.
Machine learning can boost marketing campaign profits by predicting customer behavior, segmenting audiences, and personalizing offers, optimizing ad spend, identifying high-value customers, and enhancing targeting for maximum ROI.
Here are the key roles involved in an end-to-end MLOps workflow:
- Data Engineer: Responsible for managing the data pipeline and handling changes to datasets.
- Data Scientist: Focuses on data preprocessing, model training, and producing a champion model ready for deployment.
- Training Pipeline: Automated processes ensure that models are trained with the latest data and validated effectively.
- QA Engineer: Conducts rigorous testing and validation of the model to ensure it meets business and technical requirements.
- Deploy Pipeline: Facilitates seamless model deployment to production environments while maintaining quality checks.
- MLOps Engineer: Oversees the entire lifecycle, ensuring automation, monitoring, and retraining workflows are in place for continuous improvements.
- Software Engineer: Integrates the ML API with applications, enabling users to leverage machine learning functionalities.
- Infrastructure Engineer: Ensures the application and models run smoothly on Kubernetes or other infrastructure platforms.
Azure DevOps plays a pivotal role in ensuring efficient version control within machine learning projects, providing a comprehensive version control system that empowers data teams to collaborate seamlessly.
Data Management
Data Management is a crucial aspect of any machine learning project. Transparency is essential for maintaining data quality, understanding how data has been processed, and ensuring reproducibility.
Azure ML workspace keeps track of data lineage, allowing you to trace the origin and transformations applied to each dataset. This is essential for auditing and analysis.
The workspace provides built-in version control for datasets, enabling you to create and track different versions of datasets as they evolve over time. This ensures that you always have access to historical data.
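Creating a new version of a data asset with the SDK v2 might look like this sketch; the asset name, version, and path are placeholders.

```python
from azure.ai.ml import MLClient
from azure.ai.ml.constants import AssetTypes
from azure.ai.ml.entities import Data
from azure.identity import DefaultAzureCredential

ml_client = MLClient(DefaultAzureCredential(), "<subscription-id>",
                     "<resource-group>", "<workspace-name>")

# Register an explicit new version of a dataset as it evolves.
data_asset = Data(
    name="taxi-data",             # placeholder asset name
    version="2",                  # bump as the dataset changes
    path="./data/taxi-data.csv",  # placeholder local or cloud path
    type=AssetTypes.URI_FILE,
    description="Merged yellow/green taxi dataset",
)
ml_client.data.create_or_update(data_asset)
```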
Data profiling is also supported, allowing you to assess the quality of your datasets and identify missing values, outliers, and other data anomalies that might impact model performance. This helps you to make informed decisions about data quality and model accuracy.
Azure ML workspace supports collaborative data governance, allowing data scientists, data engineers, and business stakeholders to collaborate on data preparation, quality assurance, and compliance efforts. This promotes a culture of data transparency and accountability.
By integrating with services like Azure Data Factory and Microsoft Purview, Azure ML workspace empowers you to establish a robust data management framework. This framework ensures that your machine learning projects are fueled by high-quality, compliant, and well-organized data.
Frequently Asked Questions
What is the difference between Azure MLOps and DevOps?
Azure MLOps focuses on streamlining machine learning development, while DevOps focuses on web app and software development, integrating programming, testing, and deployment. Both aim to improve efficiency and collaboration, but with different goals and processes.