Azure OpenAI Assistants offer a range of capabilities that enable developers to build custom AI solutions.
With Azure OpenAI Assistants, you can integrate AI into your applications to enhance user experience and drive business outcomes.
These assistants are designed to work seamlessly with various Azure services, making it easier to build, deploy, and manage AI models.
They provide pre-trained models that can be fine-tuned for specific use cases, reducing the time and effort required to develop custom AI solutions.
Developers can leverage these assistants to create more personalized, engaging, and efficient experiences for users.
Integrating Azure OpenAI Assistants
To integrate Azure OpenAI assistants, you'll need to set up your Azure Cognitive Services subscription first. Obtain your subscription key and region, as these will be used to configure the SpeechConfig object.
You can save your subscription key as an environment variable for easier access. For example, you can export the key as `cognitive_services_speech_key`.
Once you have your subscription key, import the Azure Cognitive Services Speech SDK and configure the SpeechConfig object with your subscription key and region. For instance, you can use the following code to set up the speech configuration: `speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region)`
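For reference, a minimal sketch of this setup might look like the following (assuming the SDK is installed via `pip install azure-cognitiveservices-speech`; the region environment variable name, the fallback region, and the voice are illustrative choices, not requirements):

```python
import os
import azure.cognitiveservices.speech as speechsdk

# Read the subscription key and region from environment variables
# (the region variable name and fallback value here are assumptions; use whatever you exported)
speech_key = os.environ["cognitive_services_speech_key"]
service_region = os.environ.get("cognitive_services_speech_region", "westeurope")

# Configure the Speech SDK with your subscription key and region
speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region)

# Optionally pick a synthesis voice (the full list is in the Azure documentation)
speech_config.speech_synthesis_voice_name = "en-US-JennyNeural"
```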
This configuration allows you to access Azure's speech recognition and synthesis capabilities. You can choose from a variety of voices, which can be found in the Azure documentation.
Here's a summary of the required configuration:
- Subscription key: your Azure Cognitive Services key (for example, stored in the `cognitive_services_speech_key` environment variable)
- Region: the Azure region of your Cognitive Services resource
Speech Recognition and Synthesis
Azure Cognitive Services provides the SpeechRecognizer class for speech recognition, which allows our voice assistant to recognize user speech from an audio input stream.
The SpeechRecognizer class is configured with the speech_config and produces the recognized text; once speech is recognized, we can pass that text to OpenAI's GPT model to generate a response.
To recognize speech, the assistant uses the following code:

```python
# Process the recognized text
speech_recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config)
result = speech_recognizer.recognize_once()
if result.reason == speechsdk.ResultReason.RecognizedSpeech:
    recognized_text = result.text
```
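Once `recognized_text` is available, it can be passed to a GPT model to generate a reply. Here's a minimal sketch, assuming the `openai` Python package (v1+) and an Azure OpenAI resource with a chat deployment; the environment variable names, API version, and deployment name `gpt-35-turbo` are placeholders for your own values:

```python
import os
from openai import AzureOpenAI

# Configure the Azure OpenAI client (endpoint, key, and API version are placeholders)
client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-01",
)

# Ask the chat model to respond to the recognized speech
completion = client.chat.completions.create(
    model="gpt-35-turbo",  # your chat deployment name
    messages=[
        {"role": "system", "content": "You are a helpful voice assistant."},
        {"role": "user", "content": recognized_text},
    ],
)
generated_response = completion.choices[0].message.content
```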
The generated response is then synthesized into speech by the SpeechSynthesizer class, which also uses the speech_config; the resulting audio can be played through the speakers or saved, using the speak_text_async method.
Here's an example of the code used to synthesize speech output:

```python
# Play or save the synthesized speech
speech_synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config)
result = speech_synthesizer.speak_text_async(generated_response).get()
if result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
    # e.g. confirm that the audio was produced (the original body is not shown)
    print("Speech synthesized successfully.")
```
To summarize, our voice assistant uses Azure Cognitive Services' SpeechRecognizer and SpeechSynthesizer classes to recognize user speech and synthesize speech output, respectively.
Azure OpenAI Assistant Tools
Azure OpenAI Assistant Tools are an essential part of creating a custom AI that uses Azure OpenAI models in conjunction with tools. Tools extend chat completions by allowing an assistant to invoke defined functions and other capabilities.
To define a function tool, you'll need to start by defining a function that the assistant can call. Assistants also support built-in tools such as code_interpreter, which lets the assistant execute code and, for example, generate visualizations based on the user's input.
With the tool defined, you can include it in the options for a chat completions request. The assistant will then use the tool to fulfill the request, and the response will include one or more "tool calls" that must be resolved via "tool messages" on the subsequent request.
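As a rough sketch of that flow, a function tool can be declared and passed along with a chat completions request like this (the `get_weather` function, its parameters, and the deployment name are hypothetical, and `client` is the AzureOpenAI client configured earlier):

```python
import json

# Hypothetical function the assistant may ask us to call
def get_weather(city: str) -> str:
    return f"It is sunny in {city}."

# Describe the function as a tool for the chat completions request
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-35-turbo",  # your chat deployment name
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
)

# If the model decided to call the tool, resolve each call with a "tool" message
for tool_call in response.choices[0].message.tool_calls or []:
    args = json.loads(tool_call.function.arguments)
    result = get_weather(**args)
    # Append the result as a tool message and send a follow-up request here
```

The follow-up request would include the original messages, the assistant message with its tool calls, and one tool message per call (each carrying the matching tool_call_id), so the model can produce its final answer.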
Workflow
The Azure OpenAI Assistant Tools workflow is a series of steps that enable the AI voice assistant to understand and respond to user input.
The process begins with the Speech-to-Text (STT) SpeechRecognizer component from Cognitive Services, which recognizes your speech and language and converts it into text.
The OpenAI component then takes the input from the SpeechRecognizer and generates an intelligent response using a GPT model, acting as the AI voice assistant component.
The response will be synthesized accordingly into Text-To-Speech (TTS) by the SpeechSynthesizer.
Here's an overview of the workflow components:
- SpeechRecognizer (Speech-to-Text): converts the user's spoken input into text
- OpenAI GPT model: generates an intelligent response from the recognized text
- SpeechSynthesizer (Text-to-Speech): converts the generated response back into speech
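Tied together, one pass through this workflow might look roughly like the sketch below, which reuses the `speech_config` and `client` objects created earlier (the helper function name and deployment name are illustrative):

```python
def run_voice_assistant_once() -> None:
    # 1. Speech-to-Text: capture one utterance from the default microphone
    recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config)
    result = recognizer.recognize_once()
    if result.reason != speechsdk.ResultReason.RecognizedSpeech:
        return

    # 2. Generate an intelligent response with the GPT model
    completion = client.chat.completions.create(
        model="gpt-35-turbo",  # your chat deployment name
        messages=[{"role": "user", "content": result.text}],
    )
    reply = completion.choices[0].message.content

    # 3. Text-to-Speech: speak the reply through the default speaker
    synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config)
    synthesizer.speak_text_async(reply).get()
```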
Thread Messages
To stream chat completions, you can use the CompleteChatStreaming and CompleteChatStreamingAsync methods, which return a CollectionResult or AsyncCollectionResult of StreamingChatCompletionUpdate objects.
Streaming chat completions can be iterated over using foreach or await foreach, allowing you to receive updates as new data becomes available from the streamed response.
Once a run is complete and its status indicates successful completion, you can list the contents of the thread again to retrieve the model's response and any tool outputs.
.NET Client Library
The .NET client library is a powerful tool for interacting with Azure OpenAI. It's a companion to the official OpenAI client library for .NET, and it configures a client for use with Azure OpenAI.
The .NET client library provides additional strongly typed extension support for request and response models specific to Azure OpenAI scenarios. This makes it easier to use Azure OpenAI capabilities in your .NET applications.
You can familiarize yourself with different APIs using Samples from OpenAI's .NET library or Azure.AI.OpenAI-specific samples. Most OpenAI capabilities are available on both Azure OpenAI and OpenAI using the same scenario clients and methods.
Here are some of the target frameworks and platforms that the .NET client library supports:
- monoandroid
- monomac
- monotouch
- tizen40, tizen60
- xamarinios
- xamarinmac
- xamarintvos
- xamarinwatchos
The .NET client library makes it easy to get started with Azure OpenAI, and the supported target frameworks give you a lot of flexibility in terms of the .NET platforms you can use.
Components
Components are the building blocks of an Azure OpenAI assistant. An Assistant is a custom AI that uses Azure OpenAI models in conjunction with tools.
A Thread is a conversation session between an Assistant and a user. It stores Messages and automatically handles truncation to fit content into a model's context.
A Message is a message created by an Assistant or a user. It can include text, images, and other files. Messages are stored as a list on the Thread.
A Run is the activation of an Assistant to begin running based on the contents of the Thread. The Assistant uses its configuration and the Thread's Messages to perform tasks by calling models and tools. As part of a Run, the Assistant appends Messages to the Thread.
A Run Step is a detailed list of steps the Assistant took as part of a Run. An Assistant can call tools or create Messages during its run. Examining Run Steps allows you to understand how the Assistant is getting to its final results.
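To make these components concrete, here is a rough sketch of how they map onto the `openai` Python client when pointed at an Azure OpenAI resource (reusing the `client` object from earlier; the instructions, prompt, and deployment name are placeholders):

```python
import time

# Assistant: a custom AI built on a model deployment, optionally with tools
assistant = client.beta.assistants.create(
    model="gpt-4",  # your deployment name
    name="Data helper",
    instructions="You are a helpful assistant.",
    tools=[{"type": "code_interpreter"}],
)

# Thread: a conversation session that stores Messages
thread = client.beta.threads.create()

# Message: created by the user (or the Assistant) and stored on the Thread
client.beta.threads.messages.create(
    thread_id=thread.id, role="user", content="Plot y = x^2 for x from 0 to 10."
)

# Run: activates the Assistant on the Thread's contents
run = client.beta.threads.runs.create(thread_id=thread.id, assistant_id=assistant.id)
while run.status in ("queued", "in_progress"):
    time.sleep(1)
    run = client.beta.threads.runs.retrieve(thread_id=thread.id, run_id=run.id)

# Run Steps: inspect what the Assistant did during the Run
steps = client.beta.threads.runs.steps.list(thread_id=thread.id, run_id=run.id)

# List the Thread's Messages again to see what the Assistant appended
messages = client.beta.threads.messages.list(thread_id=thread.id)
```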
Required Fields
To properly integrate Azure OpenAI with LibreChat, you need to accurately configure specific fields in your librechat.yaml file. These fields are validated at startup through a combination of custom values and environment variables, so the file must be set up correctly for the integration to work.
Global Settings
Global settings allow you to customize the behavior of the Azure OpenAI assistant. You can specify the model to use for generating conversation titles with the titleModel setting.
The titleModel setting can be set to a specific model, such as "gpt-3.5-turbo", or to "current_model" to dynamically use whichever model is active. If not provided, it defaults to "gpt-3.5-turbo", and no titles will be generated if that model is not available.
You can also enable conversation summarization for all Azure models by setting summarize to true. This will activate summarization, but you can also specify the model to use for generating conversation summaries with the summaryModel setting.
Here's a summary of the global settings:
- titleModel: the model used to generate conversation titles
- summarize: enables conversation summarization for all Azure models
- summaryModel: the model used to generate conversation summaries
Note that some of these settings have default values, so you may not need to specify them if you're happy with the default behavior.
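As a rough illustration, these settings sit under the azureOpenAI endpoint in librechat.yaml, alongside your group definitions (confirm the exact schema against the LibreChat Azure configuration docs):

```yaml
endpoints:
  azureOpenAI:
    titleModel: "gpt-3.5-turbo"    # or "current_model" to reuse the active model
    summarize: true                # enable conversation summarization for Azure models
    summaryModel: "gpt-3.5-turbo"  # model used to generate summaries
    groups:
      # ... your Azure deployment groups go here
```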
Deploying Models
To deploy models, you'll need to navigate to the Azure OpenAI Studio and click on the "Deployments" page. From there, you can create a new deployment by clicking on the "+" button.
You can deploy the gpt-4 model by selecting it in the "Select a model" dropdown and entering a deployment name, such as "gpt-4". Then select the "Standard" deployment type and click the "Create" button.
You can also deploy the gpt-4-vision-preview model by selecting it in the "Select a model" dropdown and entering a deployment name, such as "gpt-4-vision". This will enable vision (image analysis) with Azure OpenAI.
Here's a list of the deployment steps:
- Click on the "Deployments" page in Azure OpenAI Studio
- Create a new deployment by clicking on the "+" button
- Select the gpt-4 model and enter a deployment name, such as "gpt-4"
- Select the "Standard" deployment type and click on the "Create" button
Using Plugins
To use Plugins with Azure OpenAI, you need a deployment that supports function calling. If your deployment doesn't support function calling, set "Functions" off in the Agent settings. It's also recommended to keep "skip completion" off, so the review step of what the agent generated still runs.
Make sure the field "plugins" is set to true in your Azure OpenAI endpoint config.
This will configure Plugins to use Azure models.
Note that the current configuration through librechat.yaml uses the primary model you select from the frontend for Plugin use, which differs from how Plugins normally work without Azure. As a result, the "Agent" model setting can be ignored when using Plugins through Azure.
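In librechat.yaml, this is just a flag on the Azure endpoint configuration, roughly like so:

```yaml
endpoints:
  azureOpenAI:
    plugins: true  # configure Plugins to use Azure models
    groups:
      # ... your Azure deployment groups
```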
Generate Images with DALL-E
To generate images with DALL-E, you'll need to create an Azure resource that hosts the model. This will serve as the foundation for your image generation capabilities.
You'll then need to deploy the image generation model in one of the available regions; currently, East US and Sweden Central are the two options.
Once you've set up your Azure resource and deployed the model, you'll need to configure your environment variables based on your Azure credentials. This will allow you to access the DALL-E capabilities.
Special Considerations
As you set up Azure OpenAI services with LibreChat, there are some special considerations to keep in mind. Unique names are crucial, and both model and group names must be unique across the entire configuration to avoid validation failures.
Duplicate names will lead to errors, so give each model and group a distinct name. Missing required fields, such as deploymentName or version, will also cause validation errors unless the group represents a serverless inference endpoint.
Environment Variable References are supported, but it's essential to ensure that all referenced variables are present in your environment to avoid runtime errors. If an environment variable is referenced but not defined, it will cause errors.
You can use environment variable references in the configuration, such as ${VARIABLE_NAME}, but be aware that ${INSTANCE_NAME} and ${DEPLOYMENT_NAME} are unique placeholders that correspond to the instance and deployment name of the currently selected model. It's not recommended to use INSTANCE_NAME and DEPLOYMENT_NAME as environment variable names to avoid potential conflicts.
Any issues in the config, such as duplicate names, undefined environment variables, or missing required fields, will invalidate the setup and generate descriptive error messages to aid in prompt resolution. You won't be allowed to run the server with an invalid configuration.
Model identifiers are also important: you can use an unknown model name, but an identifier must match a known model for its context length to be applied, which is crucial for message/token handling. For example, gpt-7000 will be valid but default to a 4k token limit, whereas gpt-4-turbo will be recognized as having a 128k context limit.
Here are some key points to remember:
- Unique names are required for model and group names.
- Missing required fields can cause validation errors.
- Environment Variable References must be defined in your environment.
- Undefined environment variables will cause errors.
- Model identifiers must match a known model to reflect its known context length.
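As an illustration of these points, a group entry might look roughly like the following sketch; the group name, environment variable name, instance name, and model are placeholders, and the field names should be confirmed against the LibreChat Azure configuration docs:

```yaml
endpoints:
  azureOpenAI:
    groups:
      - group: "my-azure-group"        # must be unique across the whole config
        apiKey: "${MY_AZURE_API_KEY}"  # must be defined in your environment
        instanceName: "my-instance"
        version: "2024-02-01"
        models:
          gpt-4-turbo:                 # known model name, so its 128k context is applied
            deploymentName: "gpt-4-turbo"
```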
Troubleshooting
Interacting with Azure OpenAI using the .NET SDK can be a bit tricky, but don't worry, I've got you covered.
Errors returned by the service correspond to the same HTTP status codes returned for REST API requests, so you can use your knowledge of HTTP status codes to troubleshoot issues with the .NET SDK. For example, if you try to create a client using an endpoint that doesn't match your Azure OpenAI resource endpoint, a 404 Resource Not Found error is returned; a 404 is a clear sign that something is off, so double-check your endpoint URL.
Frequently Asked Questions
What is the function of Azure OpenAI Assistant API?
The Azure OpenAI Assistant API enables developers to integrate AI capabilities into their applications, supporting code execution, file search, and function calls through the Assistants playground or quickstart integration. Explore its capabilities and build your own integration to unlock the full potential of this powerful API.
What is the difference between Azure OpenAI and OpenAI?
OpenAI is open to the public for experimentation, while Azure OpenAI is offered to businesses through Microsoft Azure, with flexible pricing options.
How to get Azure OpenAI API key?
To obtain your Azure OpenAI API key, navigate to the "Keys and Endpoint" section of your newly created resource after deployment. From there, you can copy the necessary keys to get started with Azure OpenAI.
Sources
- https://graef.io/building-your-own-gpt-powered-ai-voice-assistant-with-azure-cognitive-services-and-openai/
- https://www.nuget.org/packages/Azure.AI.OpenAI
- https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/assistant
- https://medium.com/@ganeshneelakanta/lab06-get-started-using-azure-openai-assistants-preview-064a0234d776
- https://www.librechat.ai/docs/configuration/azure