The Azure AI Studio API offers a comprehensive set of tools for building, deploying, and managing AI models.
You can use the Azure AI Studio API to integrate AI capabilities into your applications and services.
The API provides access to a wide range of AI services, including computer vision, natural language processing, and predictive analytics.
These services are designed to help you solve complex problems and make informed decisions.
API Basics
API Basics are a fundamental concept in building and integrating applications. An API, or Application Programming Interface, is a set of defined rules that enables different systems to communicate with each other.
APIs are used to connect different services and systems, allowing them to share data and functionality. This is a key feature of the Azure AI Studio API, which enables developers to build and deploy AI-powered applications.
APIs can be classified into two main types: RESTful APIs and SOAP APIs. RESTful APIs are more commonly used in modern web development, while SOAP APIs are often used in enterprise environments.
API Specs
APIs are the backbone of any modern application, and understanding their specifications is crucial for successful development.
API specs are essentially the blueprints for your API, outlining the available endpoints, methods, and data formats. In the context of Azure OpenAI, there are three primary API surfaces: control plane, data plane - authoring, and data plane - inference.
Each API surface has its own set of capabilities and release cadence. Preview releases tend to follow a monthly cadence; the latest preview release for the control plane API is 2024-06-01-preview.
The control plane API is used for tasks like creating Azure OpenAI resources, model deployment, and resource management. It also governs what is possible to do with capabilities like Azure Resource Manager, Bicep, Terraform, and Azure CLI.
The data plane authoring API controls fine-tuning, file-upload, ingestion jobs, and batch queries, while the data plane inference API provides inference capabilities for features like completions, chat completions, and embeddings.
Here's a summary of the API surfaces and their latest releases:
- Control plane: latest stable release 2024-10-01, latest preview release 2024-06-01-preview
- Data plane - authoring: latest stable release 2024-10-21
- Data plane - inference: latest stable release 2024-10-21
Function Call Option
You can specify a particular function to be called by the model via a JSON object with a "name" property. The name of the function must be a string and can contain letters, numbers, underscores, and dashes, with a maximum length of 64 characters.
The "name" property is required when specifying a function to be called. It's used by the model to determine which function to execute.
You can also control which tool is called by the model using the tool_choice option. Here are its possible values:

- "none": the model will not call any tool and instead generates a message.
- "auto": the model can choose between generating a message or calling a tool.
- "required": the model must call a tool.
By specifying a particular function to be called or controlling the tool choice option, you can customize the behavior of the model and generate more accurate and relevant responses.
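As a sketch, here's what a chat completions request body using these options might look like; the get_weather function and its schema are illustrative placeholders, not part of the API:

```python
# Sketch of a chat completions request body that forces a specific
# function call; "get_weather" and its schema are illustrative.
request_body = {
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": [{
        "type": "function",
        "function": {
            "name": "get_weather",  # letters, numbers, underscores, dashes; max 64 chars
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
    # Alternatives: "none", "auto", "required", or force one tool as below.
    "tool_choice": {"type": "function", "function": {"name": "get_weather"}},
}
```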
Authentication
Authentication is a crucial step in using the Azure AI Studio API. You can authenticate using either API Keys or Microsoft Entra ID.
API Key authentication requires including the API Key in the api-key HTTP header, as shown in the Quickstart guide. This is a straightforward process, but it's essential to remember that all API requests must include the API Key.
Microsoft Entra ID authentication uses a token, which must be preceded by Bearer, such as Bearer YOUR_AUTH_TOKEN. This is a more secure option, but it requires setting up a Microsoft Entra ID account.
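Here's a minimal sketch of both header styles using Python's requests library; the resource name, deployment, and credentials are placeholders:

```python
import requests

endpoint = (
    "https://YOUR_RESOURCE.openai.azure.com/openai/deployments/"
    "YOUR_DEPLOYMENT/chat/completions?api-version=2024-10-21"
)
body = {"messages": [{"role": "user", "content": "Hello"}]}

# API key authentication: the key goes in the api-key header.
resp = requests.post(endpoint, headers={"api-key": "YOUR_API_KEY"}, json=body)

# Microsoft Entra ID authentication: a bearer token instead.
resp = requests.post(
    endpoint, headers={"Authorization": "Bearer YOUR_AUTH_TOKEN"}, json=body
)
```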
The Azure AI Studio API supports several authentication types:

- API key
- Connection string
- System-assigned managed identity
- User-assigned managed identity
For the On Your Data feature, API key authentication for the data source is specified via onYourDataApiKeyAuthenticationOptions, which includes type and key parameters.
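For illustration, here's roughly how that authentication shape appears inside an On Your Data data source definition; the search endpoint, index, and key are placeholders:

```python
# Sketch of an "On Your Data" data source using api_key authentication.
data_source = {
    "type": "azure_search",
    "parameters": {
        "endpoint": "https://YOUR_SEARCH.search.windows.net",
        "index_name": "YOUR_INDEX",
        "authentication": {
            "type": "api_key",  # the onYourDataApiKeyAuthenticationOptions type
            "key": "YOUR_SEARCH_API_KEY",
        },
    },
}
```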
Model Inference
The Azure AI model inference service offers access to powerful models from leading providers like OpenAI, Microsoft, Meta, and more. These models support tasks such as content generation, summarization, and code generation.
You can access OpenAI models through the Azure OpenAI data plane inference specification, which is updated regularly; the latest GA release is 2024-10-21.
To use the model inference service, you'll need to install the azure-ai-inference client library. This gives you access to a configured and authenticated ChatCompletionsClient or EmbeddingsClient, which you can use to interact with the models.
Here are some key features of the Azure AI model inference service:

- Access to models from leading providers such as OpenAI, Microsoft, and Meta
- A single, consistent API surface (the Azure AI Model Inference API) across those models
- Configured and authenticated clients for chat completions and embeddings
The Azure AI model inference service supports a range of models, including those from OpenAI, Microsoft, and Meta. You can change the model name to any model that you deployed to the inference service or Azure OpenAI service.
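A minimal sketch of using the client library against a serverless endpoint; the endpoint URL, key, and model name are placeholders for whatever you deployed:

```python
from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import SystemMessage, UserMessage
from azure.core.credentials import AzureKeyCredential

# Connect to a serverless endpoint with an API key (placeholders).
client = ChatCompletionsClient(
    endpoint="https://YOUR_ENDPOINT.inference.ai.azure.com",
    credential=AzureKeyCredential("YOUR_API_KEY"),
)

response = client.complete(
    messages=[
        SystemMessage(content="You are a helpful assistant."),
        UserMessage(content="Summarize what an embedding is."),
    ],
    model="YOUR_MODEL_NAME",  # any model deployed to the inference service
)
print(response.choices[0].message.content)
```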
Data Plane Inference
The data plane inference API is a key part of Azure OpenAI; its latest GA release is dated 2024-10-21.
For the newest capabilities, you can refer to the latest preview data plane inference API, which is available for developers to explore.
The serverless API endpoint is a great way to deploy models in Azure Machine Learning and Azure AI Studio, allowing developers to consume predictions from a diverse set of models in a uniform and consistent way.
Model Subscription
Model Subscription is a crucial step in deploying models as serverless API endpoints. You can subscribe to a model offering through the Azure Marketplace, which allows you to control and monitor spending.
To create a model subscription, you can use the Azure CLI command `az ml marketplace-subscription create -f subscription.yml`. This command reads a YAML file that specifies the model ID and subscription name (for example, `model_id` and `name` fields).
You can also use a Bicep configuration to create a model subscription, as shown in the example: `resource projectName_subscription 'Microsoft.MachineLearningServices/workspaces/marketplaceSubscriptions@2024-04-01-preview' = if (!startsWith(modelId, 'azureml://registries/azureml/')) { ... }`. This configuration creates a new resource to track costs associated with the model's consumption.
Once you subscribe to a model offering, subsequent deployments of the same offering in the same project don't require subscribing again.
You can see the model offers to which your project is currently subscribed using the Azure CLI command `az ml marketplace-subscription list`. This command lists all the model subscriptions associated with your project.
Here's a summary of the steps to create a model subscription:

1. Subscribe to the model offering through the Azure Marketplace, using either the Azure CLI or a Bicep configuration.
2. Deploy the model; later deployments of the same offering in the same project reuse the subscription.
3. Review your subscriptions with `az ml marketplace-subscription list`.
4. Delete the subscription via the Azure portal or Azure CLI when it's no longer needed.
If you no longer need a model subscription, you can delete it using the Azure portal or Azure CLI. Deleting a model subscription causes any associated endpoint to become Unhealthy and unusable.
Model Inference Service
The Model Inference Service is a powerful tool that allows you to access a wide range of models from leading providers like OpenAI, Microsoft, Meta, and more.
You can use the Azure AI model inference service to perform tasks such as content generation, summarization, and code generation.
To get started, ensure that your project has an AI Services connection in the management center.
You can install the azure-ai-inference client library and then use the project client to get a configured and authenticated ChatCompletionsClient or EmbeddingsClient.
To connect to your deployment and endpoint, you need the endpoint URL and credentials. The model_name parameter is not required for endpoints that serve a single model, such as Managed Online Endpoints.
Alternatively, you can use Microsoft Entra ID to create the client, but make sure that the endpoint was deployed with that authentication method and that you have the required permissions to invoke it.
If you're planning to use asynchronous calling, it's a best practice to use the asynchronous version for the credentials.
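A sketch of the Entra ID variant, assuming the endpoint was deployed with that authentication method and the endpoint URL is a placeholder:

```python
from azure.ai.inference import ChatCompletionsClient
from azure.identity import DefaultAzureCredential

# Entra ID authentication; assumes your identity can invoke the endpoint.
client = ChatCompletionsClient(
    endpoint="https://YOUR_ENDPOINT.inference.ai.azure.com",
    credential=DefaultAzureCredential(),
)

# For asynchronous calling, use the async client and credential:
# from azure.ai.inference.aio import ChatCompletionsClient
# from azure.identity.aio import DefaultAzureCredential
```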
Here are some client libraries available for Azure AI Services:
- Azure AI services SDKs
- Azure AI services REST APIs
- Azure AI Services Python Management Library
- Azure AI Search Python Management Library
You can use the complete endpoint for text completion. Rather than adding parameters to each chat or completion call, you can set them once on the client instance.
For parameters that are not supported by the Azure AI model inference API but are available in the underlying model, you can use the model_extras argument.
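For example, a sketch reusing the client from above; safe_mode is only an illustration and depends on the underlying model accepting it:

```python
from azure.ai.inference.models import UserMessage

# Pass a model-specific parameter through to the underlying model;
# "safe_mode" is an illustrative example, not part of the common API.
response = client.complete(
    messages=[UserMessage(content="How many feet are in a mile?")],
    model_extras={"safe_mode": True},
)
```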
Evaluation
To use the Azure AI evaluation service, you can easily connect to it using the project client. This allows you to run your evaluators with the necessary models.
You can instantiate a ViolenceEvaluator by using the project.scope parameter.
Note that your project needs to be in one of these regions to run violence evaluators: East US 2, Sweden Central, North Central US, or France Central.
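A minimal sketch, assuming an existing project in a supported region; the connection string is a placeholder:

```python
from azure.ai.evaluation import ViolenceEvaluator
from azure.ai.projects import AIProjectClient
from azure.identity import DefaultAzureCredential

# Connect to the project; the connection string is a placeholder.
project = AIProjectClient.from_connection_string(
    conn_str="YOUR_PROJECT_CONNECTION_STRING",
    credential=DefaultAzureCredential(),
)

# Instantiate the evaluator with the project scope.
violence_eval = ViolenceEvaluator(
    azure_ai_project=project.scope,
    credential=DefaultAzureCredential(),
)
result = violence_eval(query="What is the capital of France?", response="Paris.")
```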
Request and Response
The Azure AI Studio API has a robust request and response system that allows for seamless communication between the client and server. This system is built around several key schemas, including createCompletionRequest and createChatCompletionResponse.
A completion response includes a unique identifier, a list of completion choices, and a Unix timestamp of when the completion was created, along with the model used for the completion and a system fingerprint that represents the backend configuration of the model.
The createChatCompletionResponse schema is similar, but its choices contain chat completion messages, and its timestamp records when the chat completion was created; it likewise includes the model used and a system fingerprint.
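For illustration, a chat completion response has roughly this shape (all values are made up):

```python
# Illustrative shape of a createChatCompletionResponse payload.
chat_completion_response = {
    "id": "chatcmpl-abc123",            # unique identifier
    "object": "chat.completion",
    "created": 1729000000,              # Unix timestamp
    "model": "gpt-4o",
    "system_fingerprint": "fp_abc123",  # backend configuration; optional
    "choices": [{
        "index": 0,
        "finish_reason": "stop",
        "message": {"role": "assistant", "content": "Hello!"},
    }],
    "usage": {"prompt_tokens": 5, "completion_tokens": 2, "total_tokens": 7},
}
```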
The chatCompletionResponseMessage schema represents a chat completion message generated by the model based on the provided input; it includes the role of the message's author.
The createChatCompletionStreamResponse schema represents a streamed chunk of a chat completion response returned by the model. It includes a unique identifier, a list of chat completion choices, and a Unix timestamp of when the chat completion was created; each choice carries a chatCompletionStreamResponseDelta with the incremental content.
The chatCompletionMessageToolCall schema represents a tool call in a chat completion response. It includes the ID of the tool call, the type of the tool call, and the function that the model called.
The chatCompletionFunctionCall schema is deprecated and replaced by tool_calls. It includes the name of the function to call and the arguments to call the function with, as generated by the model in JSON format.
The chatCompletionStreamOptions schema holds options for streaming responses; its only current option is include_usage, which adds token usage statistics for the entire request.
Here are some key differences between the createCompletionResponse and createChatCompletionResponse schemas:

- A completion response has the object type "text_completion", while a chat completion response has "chat.completion".
- Completion choices contain generated text directly, while chat completion choices contain a message with a role and content.

Note that the system fingerprint appears in both responses, but it is not required.
Functionality
The Azure AI Studio API offers a range of functionality that makes it easy to build and deploy AI-powered applications. You can use the serverless API endpoint to access the Azure AI Model Inference API, which provides a common set of capabilities for foundational models.
The API supports a variety of models, including those from OpenAI, Microsoft, and other leading providers. You can use the Azure OpenAI Service to access OpenAI's models, including the GPT-4o, GPT-4o mini, and GPT-4 models, with the added benefits of Azure's data residency, scalability, safety, security, and enterprise capabilities.
To get started, you'll need to ensure your project has an AI Services connection, and then install the azure-ai-inference client library. From there, you can use the project client to get a configured and authenticated ChatCompletionsClient or EmbeddingsClient.
Stream Options
Stream Options are crucial when it comes to streaming responses, and you can control them using the chatCompletionStreamOptions.
The include_usage option allows you to include token usage statistics for the entire request, which can be useful for understanding how your requests are being processed.
Note that the json, text, srt, verbose_json, and vtt formats belong to the audio transcription and translation endpoints' response_format parameter, not to chat streaming; chatCompletionStreamOptions itself only controls usage reporting.
If you want to include token usage statistics, you can set the include_usage option to true. This will include an additional chunk in the response with the usage field showing the token usage statistics for the entire request.
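A sketch of a streaming request with usage enabled; the message content is a placeholder:

```python
# Streaming request body with token usage reporting enabled.
request_body = {
    "messages": [{"role": "user", "content": "Tell me a joke."}],
    "stream": True,
    "stream_options": {"include_usage": True},  # adds a final usage chunk
}
```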
Functions
Functions are the building blocks of functionality, allowing you to create custom actions and behaviors in your application.
To define a function, you'll need to provide a name, which may contain only the characters a-z, A-Z, 0-9, underscores, and dashes, with a maximum length of 64 characters. The model uses this name to choose when and how to call the function.
A function can have parameters, which are described as a JSON Schema object. Omitting parameters defines a function with an empty parameter list.
Here's a breakdown of the required and optional fields for defining a function:

- name (required): the identifier the model uses when calling the function.
- description (optional): a description of what the function does, which the model uses to decide when to call it.
- parameters (optional): a JSON Schema object describing the function's arguments; omitting it defines a function with an empty parameter list.
You can specify a particular function via {"name": "my_function"} to force the model to call that function.
In a chat, a function message has a specific format: a role of "function", the content, and the name of the function that was called. The function name is also required when forcing a particular function via chatCompletionFunctionCallOption or when supplying a chatCompletionRequestFunctionMessage.
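As a sketch of these (deprecated) function-message shapes, with an illustrative function name and payload:

```python
# Force a particular function (chatCompletionFunctionCallOption shape).
function_call_option = {"name": "get_weather"}

# A function message carrying that function's result back to the model
# (chatCompletionRequestFunctionMessage shape); the payload is made up.
function_message = {
    "role": "function",
    "name": "get_weather",
    "content": '{"temperature_c": 21, "condition": "sunny"}',
}
```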
Usage
The Azure AI Model Inference API provides a common set of capabilities for foundational models, allowing developers to consume predictions from a diverse set of models in a uniform and consistent way.
You can use the serverless API endpoint to deploy models in Azure Machine Learning and Azure AI Studio, which supports the Azure AI Model Inference API.
The completionUsage section of the API provides usage statistics for the completion request, which includes the number of tokens in the prompt, generated completion, and total tokens used.
The completionUsage section includes the following parameters: prompt_tokens, completion_tokens, total_tokens, and completion_tokens_details. The first three are required, while completion_tokens_details offers a further breakdown.
Here is a breakdown of the completionUsage parameters:

- prompt_tokens: the number of tokens in the prompt.
- completion_tokens: the number of tokens in the generated completion.
- total_tokens: the total number of tokens used (prompt plus completion).
- completion_tokens_details: a breakdown of the completion tokens, such as reasoning tokens.
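Reusing the response object from the earlier inference sketch, the usage figures can be read like this:

```python
# completionUsage fields on a chat completions response.
usage = response.usage
print(usage.prompt_tokens, usage.completion_tokens, usage.total_tokens)

# total_tokens is the sum of prompt and completion tokens.
assert usage.total_tokens == usage.prompt_tokens + usage.completion_tokens
```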
You can also use the model_extras argument to pass extra parameters that are not supported by the Azure AI model inference API but are available in the underlying model.
Search Query Type
When working with search queries in Azure OpenAI, you have five options to choose from. The type of query you select determines how your search results are generated.
The simple query parser is the default option, which is represented by the "simple" value in the AzureSearchQueryType enum. This type of query is straightforward and easy to use.
You can also use the semantic query parser, which is represented by the "semantic" value. This option is ideal for advanced semantic modeling and provides more sophisticated search results.
Vector search is another option, represented by the "vector" value. This type of query is useful for searching over computed data.
If you're looking for a combination of simple and vector search, you can use the "vector_simple_hybrid" value. This option allows you to leverage the strengths of both simple and vector search.
Alternatively, you can use the "vector_semantic_hybrid" value, which combines semantic search with vector data querying.
Here's a quick rundown of the different search query types:

- "simple": the default, simple query parser.
- "semantic": the semantic query parser for advanced semantic modeling.
- "vector": vector search over computed data.
- "vector_simple_hybrid": combines simple and vector search.
- "vector_semantic_hybrid": combines semantic search with vector data querying.
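For illustration, the query type is selected via the query_type field of an azure_search data source; the endpoint and index values are placeholders:

```python
# Sketch of an azure_search data source selecting a hybrid query type.
data_source = {
    "type": "azure_search",
    "parameters": {
        "endpoint": "https://YOUR_SEARCH.search.windows.net",
        "index_name": "YOUR_INDEX",
        "query_type": "vector_simple_hybrid",  # one of the values above
    },
}
```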
Agent Service
The Azure AI Agent Service is a fully managed service designed to empower developers to securely build, deploy, and scale high-quality, extensible AI agents.
Using an extensive ecosystem of models, tools, and capabilities from OpenAI, Microsoft, and third-party providers, Azure AI Agent Service enables building agents for a wide range of generative AI use cases.
With Azure AI Agent Service, developers can build agents and deploy them across a variety of settings, from experiments to production scenarios that demand quality and extensibility.
Azure AI Agent Service is fully managed, which means developers can focus on building and deploying their agents without worrying about the underlying infrastructure.
Frequently Asked Questions
Is Azure AI Studio free?
Yes, Azure AI Studio is free to use and explore, with no need for an Azure account. However, individual features may incur normal billing rates.
Is Azure AI Studio still in preview?
No, Azure AI Studio is no longer in preview. It has transitioned to general availability, announced at the Microsoft Build 2024 developer conference.
What is Azure AI Studio?
Azure AI Studio is a comprehensive platform for developing and deploying generative AI apps and APIs responsibly. It enables users to build copilots and AI solutions faster with prebuilt and customizable models.
Sources
- https://learn.microsoft.com/en-us/azure/ai-services/openai/reference
- https://learn.microsoft.com/en-us/azure/ai-studio/how-to/deploy-models-serverless
- https://docs.llamaindex.ai/en/stable/examples/llm/azure_inference/
- https://learn.microsoft.com/en-us/azure/ai-studio/how-to/develop/sdk-overview
- https://www.codecademy.com/article/getting-started-with-azure-open-ai-service