The Azure Model Catalog gives developers and data scientists a centralized repository for language models, making it easier to discover, manage, and deploy models across applications.
By providing a single source of truth for language models, the Azure Model Catalog streamlines the development process and reduces the time spent searching for models. This is especially useful for large-scale projects where multiple models are required.
With the Azure Model Catalog, users can easily browse and search for models based on specific criteria, such as model type, performance, and compatibility. This level of organization and accessibility is a significant improvement over traditional model management approaches.
Expanding the Ecosystem
You can work with open source models curated by Azure Machine Learning, which allows for a wide range of options to choose from.
The curated collections in Azure Machine Learning are updated frequently, and new models are added to the Model Catalog regularly.
This means you'll have access to a constantly growing library of models, making it easier to find the right one for your needs.
Open Source Models
Open source models let you build on existing work instead of starting from nothing.
Azure Machine Learning offers a curated collection of models that you can use in your workspace.
These models are stored in a managed storage account, which is accessible through a Service Endpoint Policy.
This setup allows you to access the models in a secure and controlled manner.
New curated collections are added frequently to the Model Catalog, so you can expect to see more options in the future.
A default outbound rule to the Microsoft Container Registry also makes it easy to deploy the models.
Language Models in Collections
Language models in collections like "Curated by Azure AI" require dynamic installation of dependencies at runtime.
To use these models, add user-defined outbound rules at the workspace level for the following FQDNs, so that dependencies can be downloaded at runtime:
- *.anaconda.org
- *.anaconda.com
- anaconda.com
- pypi.org
- *.pythonhosted.org
- *.pytorch.org
- pytorch.org
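To see what these patterns do and do not cover, here is a minimal Python sketch. It is not how Azure evaluates outbound rules; it assumes simple shell-style wildcard matching, and the `rule_name` and `is_covered` helpers are hypothetical names made up for illustration.

```python
from fnmatch import fnmatchcase

# FQDN patterns copied from the list above.
REQUIRED_FQDNS = [
    "*.anaconda.org",
    "*.anaconda.com",
    "anaconda.com",
    "pypi.org",
    "*.pythonhosted.org",
    "*.pytorch.org",
    "pytorch.org",
]


def rule_name(pattern: str) -> str:
    """Build a hypothetical rule name, e.g. '*.pytorch.org' -> 'allow-wildcard-pytorch-org'."""
    return "allow-" + pattern.replace("*.", "wildcard-").replace(".", "-")


def is_covered(hostname: str, patterns=REQUIRED_FQDNS) -> bool:
    """Return True if the hostname matches any pattern (simplified wildcard matching)."""
    host = hostname.lower().rstrip(".")
    return any(fnmatchcase(host, pattern) for pattern in patterns)


if __name__ == "__main__":
    # Print one illustrative rule name per required FQDN.
    for pattern in REQUIRED_FQDNS:
        print(f"{rule_name(pattern):32} -> {pattern}")
    # Check a few sample hosts against the allowlist.
    for host in ("files.pythonhosted.org", "pypi.org", "anaconda.org", "example.com"):
        print(host, "covered" if is_covered(host) else "NOT covered")
```

Under this simplified model, a wildcard entry such as *.anaconda.com does not match the bare domain, which is one plausible reason the list includes anaconda.com and pytorch.org as separate entries.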
Virtual Network and Data
In Azure, virtual networks are used to connect and isolate resources, allowing you to control access to your data and applications.
Azure virtual networks support multiple subnets, enabling you to organize your resources into logical groups and assign IP addresses to each subnet.
With Azure virtual networks, you can also create network security groups to control inbound and outbound traffic, ensuring that only authorized traffic reaches your resources.
Network traffic can be monitored and analyzed using Azure Network Watcher, which provides insights into network performance and security.
Azure Data Lake Storage is a highly scalable and secure data storage solution that can be used to store and process large amounts of data.
Virtual Network for Internet Outbound
Setting up a virtual network to allow internet outbound is a great way to access the Model Catalog from within your workspace. You can configure a workspace with a managed virtual network to achieve this.
To do this, follow Microsoft's instructions for configuring a managed virtual network. If you disable public network access to the workspace, you'll need to connect using one of the alternative methods.
With a managed virtual network that can access the internet, you'll have access to all the Collections in the Model Catalog. This is especially useful for troubleshooting and learning.
Here are the methods you can use to connect to the workspace if public network access is disabled:
- Azure VPN gateway
- ExpressRoute
- Azure Bastion
If the connection fails, see Microsoft's guide to troubleshooting managed virtual networks.
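As a rough illustration of the managed virtual network setup described in this section, the Python sketch below builds a settings dictionary shaped like a workspace definition. The field names and isolation mode values are assumptions based on my understanding of Azure Machine Learning workspace configuration, and `workspace_network_settings` is a hypothetical helper, so check everything against the current Microsoft documentation before relying on it.

```python
import json

# Assumed isolation mode names for an Azure ML managed network; verify against current docs.
ISOLATION_MODES = (
    "disabled",
    "allow_internet_outbound",
    "allow_only_approved_outbound",
)


def workspace_network_settings(name: str, isolation_mode: str, public_access: bool) -> dict:
    """Return a hypothetical workspace settings dict for the given isolation mode."""
    if isolation_mode not in ISOLATION_MODES:
        raise ValueError(f"unknown isolation mode: {isolation_mode!r}")
    return {
        "name": name,
        # Assumed field: controls whether the workspace is reachable from the public internet.
        "public_network_access": "enabled" if public_access else "disabled",
        "managed_network": {"isolation_mode": isolation_mode},
    }


if __name__ == "__main__":
    # "my-workspace" is a placeholder name, not a real resource.
    settings = workspace_network_settings(
        "my-workspace", "allow_internet_outbound", public_access=False
    )
    print(json.dumps(settings, indent=2))
```

With internet outbound allowed, no per-collection FQDN rules should be needed; with public access disabled, as in this example, you would still have to reach the workspace through a private connection.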
Data Catalog
A data catalog helps you find and understand data assets. Azure Data Catalog is one example, and it makes data asset discovery straightforward.
It's a fully managed cloud service that lets users register, enrich, discover, understand, and consume data sources. This means you can easily find and access the data you need.
Data catalogs like Azure Data Catalog are especially useful for large organizations with many data sources. They help you keep track of all your data and make it easier for different teams to work together.
Frequently Asked Questions
What is an Azure catalog?
Azure Data Catalog is a centralized platform that helps discover and understand data assets across an organization. It's a cloud-based service that simplifies data discovery, registration, and consumption for all users.
What is a model catalog?
A model catalog is a centralized inventory of all models in an organization, using metadata to track and govern their usage. It provides visibility into model performance and usage, ensuring consistency and fairness across teams.
Does Azure have a data modeling tool?
Yes, Azure offers a data modeling tool that helps businesses structure and organize data in the cloud for better decision-making. This tool is designed to simplify data management and support informed business decisions.
Sources
- https://argonsys.com/microsoft-cloud/library/expanding-the-azure-ai-model-catalog-ecosystem/
- https://learn.microsoft.com/en-us/azure/machine-learning/how-to-network-isolation-model-catalog
- https://techcommunity.microsoft.com/t5/ai-machine-learning-blog/expanding-the-azure-ai-model-catalog-ecosystem/ba-p/4147215
- https://academy.starburst.io/configure-an-azure-data-lake-storage-catalog
- https://dbmstools.com/categories/data-catalogs/azure-sql-database