
Deploying Azure OpenAI involves a few key steps. The first is to create a resource group, which is a logical container for your Azure resources.
When you create a resource group, you choose a subscription and a location. Azure has data centers all over the world, so pick the region closest to your users.
The resource group also needs a unique, descriptive name, which helps you and others identify your resources in the Azure portal.
Azure OpenAI itself runs as a dedicated Azure Cognitive Services resource, which you can deploy using the Azure portal or the Azure CLI.
Prerequisites
Before you begin, you need an active Azure subscription and access to the Azure OpenAI service, which you must request from Microsoft. Access is typically granted within a couple of hours, but it can take up to 10 business days.
To get started, log into the Azure portal and select the Create a resource option.
Creating a Resource
To create an Azure OpenAI resource, you'll need to select the subscription where the new resources will be located.
You can create or select a resource group, but make sure to set the region correctly, as not every model type is available in every region. Currently, for access to gpt-4o models, you need to select East US 2.
Create a unique name for the resource, and choose the pricing tier. You can leave other values as default and use the Next button to proceed.
The resource creation process can take a couple of minutes, and once it's complete, you'll be redirected to a summary page for your deployment. From there, you can select the Go to resource button to work with your new Azure OpenAI resource.
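The same resource can also be created programmatically. As a rough sketch, here is how the Azure Resource Manager (ARM) call might be assembled in Python; the subscription ID, resource group, resource name, and API version are placeholder examples, and the request is only constructed here, not sent:

```python
import json

subscription_id = "<subscription-id>"   # placeholder
resource_group = "<resource-group>"     # placeholder
resource_name = "my-openai-resource"    # must be globally unique

# ARM endpoint for creating a Cognitive Services account of kind OpenAI.
url = (
    "https://management.azure.com"
    f"/subscriptions/{subscription_id}"
    f"/resourceGroups/{resource_group}"
    "/providers/Microsoft.CognitiveServices"
    f"/accounts/{resource_name}"
    "?api-version=2023-05-01"           # example ARM API version
)

body = json.dumps({
    "kind": "OpenAI",
    "sku": {"name": "S0"},      # pricing tier
    "location": "eastus2",      # region chosen for gpt-4o availability
    "properties": {},
})
# This would be sent as an authenticated PUT request, e.g. with a bearer
# token obtained from the Azure CLI or an Azure identity library.
```

The payload mirrors the portal fields: kind selects Azure OpenAI, sku is the pricing tier, and location is the region.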
Transcriptions - Create
To create a transcription, you provide the audio file to transcribe, which is required. In the API specification the file field is typed as string (binary): the raw audio content.
The API version is also required and is passed as a query parameter. You can choose from a variety of API versions to use with your deployment.
You also need the deployment ID of your Whisper model deployment, which routes the request to the correct model.
If you want to guide the model's style or continue a previous audio segment, you can provide an optional text prompt. It should match the audio language for best results.
Here are the parameters you can use to create a transcription:
- file (required): the audio file to transcribe, typed as string (binary)
- api-version (required): passed in the query string
- deployment-id (required): the Whisper model deployment to use
- prompt (optional): text to guide style or continue a previous segment
The supported content types for the output are application/json and text/plain, which return the transcribed text as JSON or plain text, respectively.
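As a rough sketch, a transcription request against this endpoint might be assembled like this; the resource name, deployment name, and API version are placeholder examples, and the request is only constructed here, not sent:

```python
from urllib.parse import urlencode

def build_transcription_request(resource, deployment, api_version):
    """Return the URL and headers for a Whisper transcription call."""
    base = f"https://{resource}.openai.azure.com/openai/deployments/{deployment}"
    query = urlencode({"api-version": api_version})
    url = f"{base}/audio/transcriptions?{query}"
    # The audio file itself is uploaded as multipart/form-data;
    # the API key goes in the "api-key" header.
    headers = {"api-key": "<your-api-key>"}  # placeholder
    return url, headers

url, headers = build_transcription_request(
    "my-resource", "my-whisper-deployment", "2024-06-01")
```

The deployment ID and API version sit in the URL, while the file and optional prompt travel in the form body.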
Translations - Create
To create a translation, you provide the audio file to translate; the file is required, so make sure it's in a supported format.
As with transcriptions, the file field is typed as string (binary) in the API specification: it carries the raw audio content, not text. The prompt field is optional, but if you use it, it should be in English.
The response format is also optional; if you specify it, use a value of the audioResponseFormat type.
You can adjust the temperature of the model to make the output more or less random. The default is 0, which tells the model to use log probability to automatically increase the temperature until certain thresholds are hit.
Here's a quick rundown of the required and optional fields for creating a translation:
- file (required): the audio file to translate, typed as string (binary)
- prompt (optional): guidance text, written in English
- response_format (optional): an audioResponseFormat value
- temperature (optional): sampling temperature, defaulting to 0
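The form fields for a translation call can be sketched as below; the helper and its defaults are illustrative, and the file itself would be attached separately as multipart/form-data:

```python
def build_translation_form(prompt=None, response_format="json", temperature=0.0):
    """Assemble the optional form fields for a Whisper translation call."""
    # Form values travel as strings in multipart/form-data.
    form = {"response_format": response_format, "temperature": str(temperature)}
    if prompt is not None:
        form["prompt"] = prompt  # should be written in English
    return form

form = build_translation_form(prompt="Meeting notes", temperature=0.2)
```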
Deploying from Azure
To deploy an Azure OpenAI model, open Azure OpenAI Studio, select "Deploy model", and choose one of the options: Deploy base model or Deploy fine-tuned model.
If you select Deploy base model, you'll see a list of available OpenAI models in your region, along with a description of each model in the right panel.
To configure the deployment, you'll need to fill in the fields, including Deployment Name, Model Version, Deployment Type, Tokens per Minute Rate Limit, and Content Filter.
You can also deploy an Azure OpenAI model from the AI Foundry portal model catalog, where you can select a model such as gpt-4o-mini and deploy it to a real-time endpoint.
To deploy from the AI Foundry portal model catalog, sign in to Azure AI Foundry, select your project, and then select Model catalog from the left navigation pane.
In the Collections filter, select Azure OpenAI, and then select a model such as gpt-4o-mini from the Azure OpenAI collection.
You can also initiate deployment by starting from your project in AI Foundry portal, by going to My assets > Models + endpoints and selecting + Deploy model > Deploy base model.
Here are the steps to deploy an Azure OpenAI model from the AI Foundry portal:
- Sign in to Azure AI Foundry.
- Select your project.
- Go to My assets > Models + endpoints.
- Select + Deploy model > Deploy base model.
- Select Azure OpenAI from the Collections filter.
- Select a model such as gpt-4o-mini from the Azure OpenAI collection.
- Specify the deployment name and modify other default settings.
- Select Deploy.
- You land on the deployment details page. Select Open in playground.
Selecting "Open in playground" redirects you to a Chat session where you can test the deployed model.
Configuring the Copilot
To use the model in Copilot Studio, we deploy it as a new copilot. This feature is currently in preview.
A pop-up screen asks us to agree to connect our Azure OpenAI subscription with our Copilot tenant, which may lead to data processing outside our Copilot Studio tenant's geographic region.
After saving the changes, we can test the copilot and see the same results as when we tested the model in Azure OpenAI.
Using Copilot Studio
In Copilot Studio, we can create a new copilot and enable the option to boost conversations with generative answers.
To start, we need to create a new index in the Azure AI Search service, which we'll use to store our data. The index name should be "northwind-customers-index" as specified in the Azure AI Search service.
We'll also need to specify the title and content data for our copilot. The title should be "CompanyName" and the content data should include "ContactName, Address, City, Country, Phone".
Here's a summary of the required fields:
- Index name: northwind-customers-index
- Title: CompanyName
- Content data: ContactName, Address, City, Country, Phone
Once we've created our new copilot, we can test it even before publishing it. This is a great way to see how our copilot will perform and make any necessary adjustments before going live.
Configure the Connection
To configure the connection, you'll need to specify the deployment, API version, and other settings. First, select the connection to the Azure OpenAI model, which will allow you to configure the connection properties.
In the General tab, you'll need to specify the following values: Deployment, API version, Maximum tokens in response, Temperature, and Top P. Make sure to add these parameters as numbers, not strings.
Here are the specific values used in the example:
- Deployment: northwind-model
- API version: 2023-06-01-preview
- Maximum tokens in response: 800
- Temperature: 0
- Top P: 1
Note that the Temperature and Top P parameters can affect the performance and accuracy of the model, so be careful when adjusting these settings.
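As a sketch, the settings above map onto a chat completions request body like this (the values are the example values listed above; note the numeric fields are JSON numbers, not strings):

```python
import json

# Maximum tokens, temperature, and top P must be serialized as
# JSON numbers, not strings.
body = json.dumps({
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 800,   # Maximum tokens in response
    "temperature": 0,    # Temperature
    "top_p": 1,          # Top P
})
```

The deployment name (northwind-model) and API version (2023-06-01-preview) go into the request URL rather than the body.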
Model Deployment
To deploy a model, you can start by opening Azure OpenAI Studio and clicking on "Deploy model", then choosing between deploying a base model or a fine-tuned model.
You can also deploy a model from the model catalog in the AI Foundry portal. To do this, sign in to Azure AI Foundry, select your project, and navigate to the Model catalog. From there, select Azure OpenAI and choose a model to deploy.
The deployment process involves configuring settings such as deployment name, model version, and tokens per minute rate limit. You can also set a content filter to control the type of content your model can generate or process. Once you've completed the configuration, click "Deploy" to finalize the deployment.
Model Deployment Quota
Model deployment quota determines how much capacity you can allocate across deployments of an Azure OpenAI model.
You receive default quota for most Azure OpenAI models when you sign up for Azure AI Foundry.
This quota is measured in units of Tokens-per-Minute (TPM) and is assigned to your subscription on a per-region, per-model basis.
You can assign TPM to each deployment as it is created, reducing the available quota for that model by the amount you assigned.
You can continue to create deployments and assign them TPMs until you reach your quota limit.
Once you reach your quota limit, you have two options to create new deployments of that model: requesting more quota or adjusting the allocated quota on other model deployments.
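The allocation model above amounts to simple bookkeeping, sketched below; the quota figure is a made-up example, not an actual Azure limit:

```python
# Hypothetical per-region, per-model quota in tokens-per-minute (TPM).
REGION_MODEL_QUOTA_TPM = 240_000  # example figure only

deployments = {}  # deployment name -> TPM assigned at creation

def assign_tpm(name, tpm):
    """Assign TPM to a new deployment; return the remaining quota."""
    used = sum(deployments.values())
    if used + tpm > REGION_MODEL_QUOTA_TPM:
        # At the limit you must request more quota, or reduce the TPM
        # allocated to other deployments of this model in this region.
        raise ValueError("quota exceeded for this model in this region")
    deployments[name] = tpm
    return REGION_MODEL_QUOTA_TPM - used - tpm

remaining = assign_tpm("gpt-4o-mini-prod", 120_000)
```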
Inferencing
Once a model is deployed, Azure OpenAI gives you several ways to run inference against it.
You can use the playground, a web-based interface, to interact with the model in real-time and test it with different prompts.
The playground allows you to see the model's responses, giving you a clear understanding of its capabilities.
To consume the deployed model in your application, you can refer to the Azure OpenAI quickstarts for more examples and guidance.
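The quickstarts cover several languages; as a minimal Python sketch, an application might call the deployed model's chat completions endpoint as below. The resource name, deployment name, API version, and key are placeholders, and the request object is only built here, not sent:

```python
import json
import urllib.request

ENDPOINT = "https://<your-resource>.openai.azure.com"  # placeholder
DEPLOYMENT = "gpt-4o-mini"                             # your deployment name
API_VERSION = "2024-06-01"                             # example API version

def make_chat_request(prompt):
    """Build (but do not send) a chat completions request."""
    url = (f"{ENDPOINT}/openai/deployments/{DEPLOYMENT}"
           f"/chat/completions?api-version={API_VERSION}")
    body = json.dumps({"messages": [{"role": "user", "content": prompt}]})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json",
                 "api-key": "<your-api-key>"},  # placeholder
        method="POST",
    )

req = make_chat_request("What is Azure OpenAI?")
# urllib.request.urlopen(req) would send it once real values are filled in.
```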
Image Generation
Image generation is a powerful feature that allows you to generate a batch of images from a text caption on a given DALL-E model deployment.
To get started, you'll need to specify the endpoint, deployment ID, and API version. The endpoint is the URL of your Azure OpenAI resource, which should be in the format https://{your-resource-name}.openai.azure.com. The deployment ID is the ID of the DALL-E model that was deployed.
The API version is also required, which can be specified in the query parameter. The prompt is the most important part of the request, as it's the text description of the desired image(s). The maximum length of the prompt is 4,000 characters.
Here are the required parameters:
- endpoint: the URL of your resource, in the form https://{your-resource-name}.openai.azure.com
- deployment-id: the ID of the deployed DALL-E model
- api-version: specified as a query parameter
- prompt: the text description of the desired image(s), up to 4,000 characters
You can also specify optional parameters such as the number of images to generate, the size of the generated images, and the response format. The number of images can be specified using the 'n' parameter, and the default value is 1. The size of the images can be specified using the 'size' parameter, and the default value is 1024x1024.
Here are the optional parameters:
- n: the number of images to generate (default 1)
- size: the size of the generated images (default 1024x1024)
- response_format: the format of the returned images
By specifying these parameters, you can customize the image generation process to suit your needs.
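An image generation request might be assembled like this; the resource and deployment names are placeholder examples, and the request is only constructed here, not sent:

```python
import json
from urllib.parse import urlencode

def build_image_request(resource, deployment, api_version, prompt,
                        n=1, size="1024x1024"):
    """Build the URL and JSON body for an image generation call."""
    if len(prompt) > 4000:
        raise ValueError("prompt must be at most 4,000 characters")
    url = (f"https://{resource}.openai.azure.com/openai/deployments/"
           f"{deployment}/images/generations?"
           + urlencode({"api-version": api_version}))
    body = json.dumps({"prompt": prompt, "n": n, "size": size})
    return url, body

url, body = build_image_request(
    "my-resource", "my-dalle-deployment", "2024-06-01",
    "A watercolor painting of a lighthouse at dawn")
```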
Frequently Asked Questions
What is the Azure deployment name?
An Azure deployment name is a unique identifier between 1 and 64 characters long, consisting of alphanumerics, underscores, parentheses, hyphens, and periods.
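These constraints can be checked with a simple pattern; the regex below is derived from the description above, not from an official Azure schema:

```python
import re

# 1-64 characters: alphanumerics, underscores, parentheses, hyphens, periods.
DEPLOYMENT_NAME = re.compile(r"^[A-Za-z0-9_().\-]{1,64}$")

def is_valid_deployment_name(name):
    return DEPLOYMENT_NAME.fullmatch(name) is not None
```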
Sources
- https://learn.microsoft.com/en-us/azure/ai-services/openai/reference
- https://ivanatilca.medium.com/a-step-by-step-guide-to-deploying-open-ai-models-on-microsoft-azure-cab86664fbb4
- https://trailheadtechnology.com/deploying-a-gpt-4o-model-to-azure-openai-service/
- https://forwardforever.com/building-smarter-copilots-with-copilot-studio-and-azure-openai-integration/
- https://learn.microsoft.com/en-us/azure/ai-studio/how-to/deploy-models-openai