GPT-4 is a powerful language model developed by OpenAI that's now available on Azure. This model is a significant upgrade from its predecessors, with a massive 175 billion parameters.
It's designed to understand and generate human-like language, making it a game-changer for applications like chatbots, virtual assistants, and language translation tools. GPT-4 is also more accurate and reliable than its predecessors, with a 90% reduction in errors.
The integration of GPT-4 with Azure provides a scalable and secure platform for developers to build and deploy AI-powered applications. With Azure's robust infrastructure, developers can easily integrate GPT-4 into their projects and take advantage of its advanced capabilities.
GPT-4 Mini Features
GPT-4o mini now comes with Azure AI Content Safety features, including prompt shields and protected material detection, which are enabled by default.
Safety is paramount for productive use and trust, so it's great to see these features in place.
Azure AI Content Safety is already supporting developers across industries, including game development, tax filing, and education.
This means you can maximize the advancements in model speed while not compromising safety.
The Azure AI Content Safety capabilities have been improved with the introduction of an asynchronous filter.
GPT-4o mini is now available using global pay-as-you-go deployment, which is significantly cheaper than previous frontier models.
You can pay 15 cents per million input tokens and 60 cents per million output tokens, making it flexible for variable workloads.
Traffic is routed globally to provide higher throughput, and you still get control over where data resides at rest.
Global pay-as-you-go deployments also allow you to upgrade from existing models to the latest models in the same region.
GPT-4o mini offers 15M tokens per minute (TPM) throughput, and GPT-4 offers 30M TPM throughput.
Azure OpenAI Service offers GPT-4o mini with 99.99% availability and the same industry-leading speed as OpenAI.
GPT-4o mini is now available on Azure AI Batch service, which delivers high-throughput jobs with a 24-hour turnaround at a 50% discount rate.
Fine-tuning for GPT-4o mini is also being released, allowing you to customize the model for your specific use case and scenario.
This makes Azure OpenAI Service fine-tuned deployments the most cost-effective offering for customers with production workloads.
Data and Residency
Azure AI offers data residency for all 27 regions, giving customers flexibility and control over where their data is stored and processed.
This means customers can meet their unique compliance requirements with a complete data residency solution.
Regional pay-as-you-go and Provisioned Throughput Units (PTUs) offer control over both data processing and data storage.
Azure OpenAI Service is now available in 27 regions, including Spain, which launched earlier this month as our ninth region in Europe.
Pricing and Availability
GPT-4o Azure offers a global pay-as-you-go deployment option, making it flexible for variable workloads.
The rate limit for GPT-4o model is set to 1,000 tokens per minute, which translates to 6 requests per minute (RPM).
You can configure the model version, deployment type, and name when setting up a new deployment in Azure AI Studio.
GPT-4o mini is now available using the global pay-as-you-go deployment at 15 cents per million input tokens and 60 cents per million output tokens.
This is significantly cheaper than previous frontier models, making it a cost-efficient option.
The global pay-as-you-go deployment offers customers the highest possible scale, with 15M tokens per minute (TPM) throughput for GPT-4o mini and 30M TPM throughput for GPT-4o.
Azure OpenAI Service offers GPT-4o mini with 99.99% availability and the same industry leading speed as OpenAI.
Batch service is now available for GPT-4o mini, delivering high throughput jobs with a 24-hour turnaround at a 50% discount rate by using off-peak capacity.
Fine-tuning for GPT-4o mini is also available, allowing customers to further customize the model for their specific use case and scenario.
Model Outputs and Testing
The Playground in Azure OpenAI Studio offers a dynamic environment for testing and fine-tuning AI models, allowing users to experiment with different configurations and optimize their AI models for various applications.
You can use the setup panel on the left to define system messages, use templates, and add examples to guide the AI's responses, making it easier to get the desired output.
The configuration panel on the right lets you select your deployment and adjust session settings, such as the number of past messages included and the current token count, giving you more control over the model's behavior.
By testing the model in the Playground, you can see firsthand how it handles mathematical queries, as demonstrated by prompting the model to print the first 100 prime numbers.
Default Safety to GPT-4 Mini
Default safety is now a priority for GPT-4o mini on Azure OpenAI Service. Azure AI Content Safety features, including prompt shields and protected material detection, are now enabled by default.
This means you can rely on these safety features to safeguard your generative AI applications. Azure AI Content Safety is already supporting developers in various industries, such as game development, tax filing, and education.
The throughput and speed of Azure AI Content Safety have been improved, thanks to the introduction of an asynchronous filter. This allows you to maximize the advancements in model speed without compromising safety.
Microsoft's Customer Copyright Commitment applies to GPT-4o mini, giving customers peace of mind that they're protected against third-party intellectual property claims for output content.
Two Flavors of Structured Outputs
Structured Outputs is a feature that allows developers to specify the exact output format they want from an AI model. This is a game-changer for testing and validation purposes.
There are two flavors of Structured Outputs to choose from.
One option is to use a User-defined JSON Schema, which is supported by both GPT-4o-2024-08-06 and GPT-4o-mini-2024-07-18 models. This allows for a high degree of customization and flexibility.
Another option is More Accurate Tool Output, also known as "Strict Mode", which is supported by all models that support function calling, including GPT-3.5 Turbo, GPT-4, GPT-4 Turbo, and GPT-4o models from June 2023 onwards. This limited version lets developers define specific function signatures for tool use.
Here's a brief comparison of the two options:
Model Testing in Playground
The Playground in Azure OpenAI Studio is a dynamic environment that allows you to test and fine-tune AI models.
You can use the setup panel on the left to define system messages, use templates, and add examples to guide the AI's responses.
The configuration panel on the right lets you select your deployment and adjust session settings.
You can prompt the model to perform mathematical queries, like printing the first 100 prime numbers, to showcase its ability to handle such tasks.
This interactive playground is invaluable for developers to experiment with different configurations and optimize their AI models for various applications.
Model Deployment and Services
Model deployment on Azure AI Services is a straightforward process. To create a chat playground or use any of the Azure AI services, you must create at least one model deployment on Azure AI Studio.
To deploy a model, click on the "Deployments" sub-tab under the "Shared resources" tab on the navigation menu on the Azure AI Studio, and select a GPT model from the model catalogue. You can also select the Global Standard deployment type to leverage Azure's global infrastructure and dynamically route traffic to the data center with best availability for each request.
You can deploy various AI models on Azure AI Studio, including GPT-3, GPT-3.5, GPT-4, and GPT-4o. These models process and produce text that sounds human, making them useful for finishing textual exchanges, modelling conversational dynamics, and producing answers that are quite similar to human output.
Here are some of the AI models available on Azure AI Studio:
- Dall-E models – Generative AI used to generate Images/Pictures from user text input.
- GPT/completion models – Generative AI used for chat completions, predicts and generates text to continue a given prompt.
- Embeddings models – Generative AI used to generate numerical representations (embeddings) of text, capturing its semantic meaning.
Model Deployment
Model deployment is a crucial step in bringing AI models to life. You can select from a comprehensive list of AI models in the Azure AI Studio, with availability depending on the location chosen during creation.
To deploy a model, you'll need to configure key parameters, including the model version, deployment type, and name. Be mindful of the rate limit, which dictates the number of tokens processed per minute. For example, setting the rate limit to 1,000 tokens per minute translates to 6 requests per minute (RPM).
The dynamic quota feature allows Azure to automatically adjust the rate limits based on demand and resource availability. This feature is enabled by default, ensuring optimal performance and scalability.
You can deploy new language models using Azure OpenAI Service, which begins with securing access and creating the necessary resources in the Azure portal. Once resources are set up, you can navigate the updated Azure AI Studio to select and configure models like GPT-4o.
Here's a summary of the key deployment options:
- GPT-4o model supports multimodal capabilities, enabling it to handle both text and image inputs.
- The Global Standard deployment type allows you to leverage Azure's global infrastructure to dynamically route traffic to the data center with best availability for each request.
- The dynamic quota feature automatically adjusts the rate limits based on demand and resource availability.
To finalize the deployment details, you'll need to fill up the deployment details for the GPT-4o models and click the "Deploy" button. Selecting the Global Standard deployment type will provide the highest default quota for new models and eliminate the need to load balance across multiple resources.
Deploy Real-Time Audio Model
To deploy a real-time audio model, you need to have a deployment of the gpt-4o-realtime-preview model in a supported region.
You can deploy the model by going to the AI Foundry home page and making sure you're signed in with the Azure subscription that has your Azure OpenAI Service resource. Select the Real-time audio playground from under Resource playground in the left pane and then select + Create a deployment to open the deployment window.
To deploy the model, search for and select the gpt-4o-realtime-preview model and then select Confirm. Make sure to select the 2024-10-01 model version in the deployment wizard.
You can deploy the model in the East US 2 and Sweden Central regions. The gpt-4o-realtime-preview model is available for global deployments in these regions.
Here are the steps to deploy the model:
- Go to the AI Foundry home page and make sure you're signed in with the Azure subscription that has your Azure OpenAI Service resource.
- Select the Real-time audio playground from under Resource playground in the left pane.
- Select + Create a deployment to open the deployment window.
- Search for and select the gpt-4o-realtime-preview model and then select Confirm.
- In the deployment wizard, make sure to select the 2024-10-01 model version.
- Follow the wizard to deploy the model.
Frequently Asked Questions
Is GPT 4 Turbo available on Azure?
Yes, GPT-4 Turbo is available on Azure, with the latest vision-capable models in public preview. You can access GPT-4 Turbo through the Azure OpenAI Service, including the GA model gpt-4 version turbo-2024-04-09.
What is the difference between GPT-4o and 4o mini?
GPT-4o mini is a more affordable version of GPT-4o, offering a balance of performance and cost-efficiency. It's a smaller, yet still powerful, alternative to the original GPT-4o model.
Sources
- https://azure.microsoft.com/en-us/blog/openais-fastest-model-gpt-4o-mini-is-now-available-on-azure-ai/
- https://azure.microsoft.com/en-us/blog/announcing-a-new-openai-feature-for-developers-on-azure/
- https://trailheadtechnology.com/deploying-a-gpt-4o-model-to-azure-openai-service/
- https://learn.microsoft.com/en-us/azure/ai-services/openai/realtime-audio-quickstart
- https://techcommunity.microsoft.com/t5/educator-developer-blog/deploying-gpt-4o-ai-chat-app-on-azure-via-azure-ai-services-a/ba-p/4179472
Featured Images: pexels.com