Azure OpenAI Embeddings are a powerful tool for efficient AI development. They enable developers to generate dense vector representations of text inputs, which can be used for a wide range of tasks such as text classification, clustering, and retrieval.
These embeddings come from transformer-based models, a neural network architecture that's particularly well-suited to natural language processing, and the hosted service scales to large workloads.
One of the key benefits of Azure OpenAI Embeddings is that the models are pre-trained on massive text corpora, so you can start generating embeddings right away instead of spending time and resources training your own model. That gets you up and running with your AI project much more quickly.
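Once you have embedding vectors, tasks like retrieval and clustering reduce to comparing those vectors, most commonly with cosine similarity. Here's a minimal pure-Python sketch; the three-dimensional vectors are made up for illustration (real embeddings have hundreds or thousands of dimensions):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: closer to 1.0 = more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for real embeddings.
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
car = [0.0, 0.1, 0.9]

# Semantically related texts should end up closer together.
assert cosine_similarity(cat, kitten) > cosine_similarity(cat, car)
```

This is the comparison that retrieval systems run between a query embedding and each document embedding.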
Getting Started
To get started with Azure OpenAI Embeddings, you'll need to have an Azure subscription, which can be obtained for free.
The Azure OpenAI Embeddings service uses natural language processing models to generate embeddings, which are numerical vector representations of text.
First, create an Azure OpenAI Embeddings resource in the Azure portal, which will provide you with a unique endpoint URL and API key.
You can then use an Azure OpenAI SDK or the REST API to interact with the service, sending text to the embeddings endpoint.
The embeddings endpoint accepts text input, supplied either as a single string or as an array of inputs for batch requests.
Before you start using Azure OpenAI Embeddings, make sure you have the necessary permissions and access controls in place to manage your embeddings.
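As a concrete sketch of what an SDK does under the hood, the call is an HTTP POST to your resource's embeddings route, authenticated with the API key. The helper below only assembles the request pieces and sends nothing; the endpoint, deployment name, and key are placeholders, and 2024-02-01 is one published API version:

```python
def build_embeddings_request(endpoint: str, deployment: str, api_key: str,
                             texts: list[str], api_version: str = "2024-02-01"):
    """Assemble URL, headers, and JSON body for an Azure OpenAI embeddings call."""
    url = (f"{endpoint}/openai/deployments/{deployment}"
           f"/embeddings?api-version={api_version}")
    headers = {"api-key": api_key, "Content-Type": "application/json"}
    body = {"input": texts}  # the service also accepts a single string here
    return url, headers, body

# Placeholder values; use your own resource name, deployment, and key.
url, headers, body = build_embeddings_request(
    "https://my-resource.openai.azure.com", "text-embedding-ada-002",
    "<API-KEY>", ["hello world"])
```

The response is JSON containing one embedding vector per input, which you can then store or compare.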
Configuration Options
You can configure Azure OpenAI embeddings by using the AzureOpenAiEmbeddingOptions class, which provides a builder to create options. This class allows you to customize settings for embedding requests.
To set default options for all embedding requests, pass an AzureOpenAiEmbeddingOptions instance to the AzureOpenAiEmbeddingModel constructor at startup time; every subsequent request then uses those defaults.
At runtime, you can override the default options by passing a custom AzureOpenAiEmbeddingOptions instance to the EmbeddingRequest request. This allows you to tailor the options for specific requests, such as overriding the default model name.
To override the default model name for a specific request, build an AzureOpenAiEmbeddingOptions with the alternate deployment name and pass it alongside your input texts in the EmbeddingRequest.
Setting Up OpenAI
To set up Azure OpenAI for embeddings, follow the steps outlined in the Azure documentation. This will ensure a smooth integration and optimal performance.
The essential connection details are your resource's endpoint URL and API key, both shown under the "Keys and Endpoint" section of your Azure OpenAI instance in the Azure Portal.
Once you've set up your connections, you can access the embeddings through the LangChain framework, which is available after deploying an Azure OpenAI instance.
Setting Up OpenAI for Embeddings
To set up OpenAI for embeddings, you'll need to create an Azure account and get an API key. This will allow you to access Azure OpenAI embedding models through the LangChain framework.
You'll also need to have an Azure OpenAI instance deployed, which can be done by following the guide on the Azure documentation site. Once your instance is set up, you can find the API key in the Azure Portal, under the "Keys and Endpoint" section of your instance.
The number of dimensions for the embeddings can be specified, but only if the underlying model supports it (the text-embedding-3 and later models do). This setting determines the size of the output vectors.
Here are the steps to deploy an Azure OpenAI instance:
- Create an Azure account
- Get an API key
- Deploy an Azure OpenAI instance using the guide on the Azure documentation site
- Find the API key in the Azure Portal under the "Keys and Endpoint" section of your instance
Note that the LangChain framework is used to access Azure OpenAI embedding models, and the API key is required for authentication.
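Putting the steps above together in Python: the LangChain integration reads the key and endpoint from environment variables, so one common setup looks like this. The values are placeholders, and the commented lines require the langchain-openai package plus a deployed embedding model:

```python
import os

# Placeholder credentials; copy the real values from the "Keys and Endpoint"
# page of your Azure OpenAI instance in the Azure Portal.
os.environ["AZURE_OPENAI_API_KEY"] = "<your-api-key>"
os.environ["AZURE_OPENAI_ENDPOINT"] = "https://<your-resource>.openai.azure.com/"

# With the environment set, the LangChain integration picks both up:
# from langchain_openai import AzureOpenAIEmbeddings
# embeddings = AzureOpenAIEmbeddings(azure_deployment="<your-deployment-name>")
# vector = embeddings.embed_query("Hello, world")
```

The deployment name is whatever you called the model deployment in Azure; it doesn't have to match the underlying model name.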
Import Connector
Importing the Azure OpenAI connector is the first step in using the service from Ballerina.
You'll need to import the ballerinax/azure.openai.embeddings module into your Ballerina project.
Here's a simple way to do it:
- Open your Ballerina project.
- Import the ballerinax/azure.openai.embeddings module.
Once you've imported the module, you'll be able to call the Azure OpenAI embeddings API directly from your Ballerina code.
Using OpenAI Embeddings
To access Azure OpenAI embedding models, you'll need to create an Azure account and get an API key, which can be found in the Azure Portal under the "Keys and Endpoint" section of your instance.
You can deploy a version of Azure OpenAI on the Azure Portal following the guide at https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/create-resource?pivots=web-portal.
The Azure OpenAI embedding model integration for LangChain lives in the langchain-openai package, which you can install with pip.
The number of dimensions for the embeddings can be specified only if the underlying model supports it.
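For the text-embedding-3 family, OpenAI's guidance is that a shorter embedding is effectively the full vector truncated to its first n values and re-normalized, which is what the dimensions parameter gives you server-side. A local sketch of that post-processing (this is an illustration, not the service's exact implementation):

```python
import math

def shorten_embedding(vec: list[float], dims: int) -> list[float]:
    """Truncate an embedding to its first `dims` values and re-normalize
    to unit length, mirroring the effect of the `dimensions` parameter
    on text-embedding-3 models."""
    head = vec[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

# Toy 4-dimensional "embedding" shortened to 2 dimensions.
short = shorten_embedding([0.6, 0.8, 0.0, 0.0], 2)
```

Shorter vectors trade a little accuracy for less storage and faster similarity search.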
Here's a list of the properties you can configure to connect to Azure OpenAI:
The prefix spring.ai.azure.openai is the property prefix to configure the connection to Azure OpenAI, and the prefix spring.ai.azure.openai.embedding is the property prefix that configures the EmbeddingModel implementation for Azure OpenAI.
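The two prefixes map onto entries in application.properties. A hedged example of that layout (property names follow the Spring AI reference; the key, endpoint, and deployment values are placeholders):

```properties
# Connection-level properties (spring.ai.azure.openai prefix)
spring.ai.azure.openai.api-key=<your-api-key>
spring.ai.azure.openai.endpoint=https://<your-resource>.openai.azure.com/

# Embedding-model properties (spring.ai.azure.openai.embedding prefix)
spring.ai.azure.openai.embedding.options.deployment-name=text-embedding-ada-002
```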
To effectively set up Azure OpenAI for embeddings, follow the detailed steps to ensure a smooth integration and optimal performance.
You can call out to OpenAI's embedding endpoint asynchronously for query text using the aembed_query method.
You can also call out to the embedding endpoint for search documents using the embed_documents method (or aembed_documents for the async variant).
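In the langchain-openai integration, query embedding and document embedding are exposed as the embed_query / aembed_query and embed_documents / aembed_documents methods on AzureOpenAIEmbeddings. The toy stand-in below returns fake vectors and makes no API call; it only illustrates the call shape of that interface:

```python
# Toy stand-in mimicking the shape of the LangChain embeddings interface;
# AzureOpenAIEmbeddings exposes the same embed_query / embed_documents methods.
class FakeEmbeddings:
    def embed_query(self, text: str) -> list[float]:
        # Real models return dense semantic vectors; this is a dummy encoding.
        return [float(len(text)), float(text.count(" "))]

    def embed_documents(self, texts: list[str]) -> list[list[float]]:
        # One vector per input document.
        return [self.embed_query(t) for t in texts]

emb = FakeEmbeddings()
query_vec = emb.embed_query("what is an embedding?")
doc_vecs = emb.embed_documents(["first doc", "second doc"])
```

Swapping in the real AzureOpenAIEmbeddings class leaves calling code unchanged, which is what makes the integration drop into existing LangChain workflows.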
Azure OpenAI embeddings offer scalability, performance, and easy integration with existing LangChain workflows.
The embeddings generated are optimized for performance, ensuring quick response times for applications.
Frequently Asked Questions
What is the embedding limit for Azure OpenAI?
For Azure OpenAI, the maximum input length for the embedding models is 8,191 tokens, and the maximum array size for a single embedding request is 2,048 inputs.
What is the name of the Azure OpenAI embedding model?
Azure OpenAI embedding models are named for what they do, for example "text-embedding-ada-002", "text-embedding-3-small", and "text-embedding-3-large". Note that the deployment name you choose in Azure can differ from the underlying model name.
What is an embedding model in OpenAI?
An embedding model is a technique that converts words into numerical vectors, capturing their semantic meaning and relationships in a vector space. This allows for more accurate and efficient text analysis and comparison.
How to get OpenAI API key in Azure?
To get your OpenAI API key in Azure, navigate to the "Create a resource" section and follow the prompts to create a new Azure OpenAI resource. Once deployed, access the "Keys and Endpoint" section to obtain your API key.
Sources
- https://python.langchain.com/api_reference/openai/embeddings/langchain_openai.embeddings.azure.AzureOpenAIEmbeddings.html
- https://docs.spring.io/spring-ai/reference/api/embeddings/azure-openai-embeddings.html
- https://www.restack.io/p/embeddings-knowledge-azure-openai-answer-cat-ai
- https://central.ballerina.io/ballerinax/azure.openai.embeddings/latest
- https://cscblog.ethz.ch/index.php/2024/02/06/az-open-ai-rag-chromadb-langchain/