Getting started with the Azure Search OpenAI demo can seem daunting, but it's actually quite straightforward. You'll need an Azure subscription, an Azure AI Search service, and an Azure OpenAI resource.
The first step is to create a new Azure AI Search service, which hosts the search index for your data. The app retrieves from this index and passes the results to OpenAI.
To create the search service, navigate to the Azure portal, click the "Create a resource" button, select "Azure AI Search", and follow the prompts to create a new service.
With your search service in place, create an Azure OpenAI resource and connect it to the search service. This lets the app ground OpenAI's responses in your indexed data.
Service Configuration
To use Azure OpenAI On Your Data fully, you need to set one or more Azure RBAC roles. Setting up Azure role-based access control (Azure RBAC) for adding data sources is a crucial step in using Azure OpenAI On Your Data securely. For more information, see Use Azure OpenAI On Your Data securely.
Getting Started
To get started with service configuration, you can try the RAG quickstart for a demonstration of query integration with chat models over a search index. This will give you a hands-on experience with how it works.
The RAG quickstart is a great place to start because it's a demonstration, not a full-fledged implementation. You can use it to get a feel for how the different components work together.
If you want to dive deeper, you can also check out the tutorial on how to build a RAG solution in Azure AI Search. This tutorial will walk you through the features and patterns for RAG solutions that obtain grounding data from a search index.
Solution accelerators are another good starting point: they're pre-built templates that deploy Azure resources, code, and sample grounding data. You can use these templates to get an operational chat app up and running in as little as 15 minutes.
Here are some solution accelerators you can use:
- Azure-search-openai-demo: This is the code for the templates, and it's featured in several presentations.
- Language-specific versions: You can find language-specific versions of the templates using the following links.
Before you start, you should review indexing concepts and strategies to determine how you want to ingest and refresh data. This will help you decide whether to use vector search, keyword search, or hybrid search.
Enabling Authentication
You'll want to enable authentication for your Azure web app to control who can access your indexed data. By default, there's no authentication or access restrictions enabled, so anyone with a routable network connection can chat with your data.
To require authentication, follow the Add app authentication tutorial and set it up against the deployed web app. This will add an extra layer of security to your data.
You can then limit access to a specific set of users or groups by following the steps from Restrict your Azure AD app to a set of users. This involves changing the "Assignment Required?" option under the Enterprise Application and assigning users/groups access.
Users not granted explicit access will receive an error message stating that their administrator has configured the application to block users unless they are specifically granted access. This ensures that only authorized users can access your data.
Azure Active Directory (AAD) is the platform that enables this authentication and access control. By using AAD, you can manage user access and permissions with ease.
Automatic Index Refresh Scheduling
Automatic index refresh scheduling is a feature that allows you to keep your Azure AI Search index up-to-date with the latest data without manually updating it every time. This feature is only available when you choose Azure Blob Storage as the data source.
To enable automatic index refresh, you need to add a data source using Azure OpenAI Studio. Once you've done this, you can select the refresh cadence you want to apply under the Indexer schedule option.
The refresh cadence can be set to a specific time interval, such as daily, weekly, or monthly. However, you need to be aware that after the data ingestion is set to a cadence other than once, Azure AI Search indexers will be created with a schedule equivalent to 0.5 * the cadence specified.
Here's a breakdown of what this means:
This means that at the specified cadence, the indexers will pull, reprocess, and index the documents that were added or modified from the storage container. You only need to upload the additional documents from the Azure portal, and the index will pick up the files automatically after the scheduled refresh period.
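As a quick illustration of the 0.5 × cadence rule described above, here is a minimal sketch in Python; the cadence values are examples, not the only options Azure OpenAI Studio offers:

```python
from datetime import timedelta

def indexer_interval(cadence: timedelta) -> timedelta:
    """Interval Azure AI Search applies to the created indexer:
    0.5 * the cadence selected under Indexer schedule."""
    return cadence / 2

# A daily refresh cadence yields an indexer that runs every 12 hours.
print(indexer_interval(timedelta(days=1)))
```

So with a daily cadence, documents uploaded to the storage container are picked up within at most half a day of the schedule firing.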
Data Ingestion and Storage
Data ingestion into Azure AI Search involves using integrated vectorization, a new offering that utilizes prebuilt skills for chunking and embedding input data. This update does not alter existing API contracts.
The ingestion process has undergone modifications, resulting in only three assets being created: {job-id}-index, {job-id}-datasource, and {job-id}-indexer (if a schedule is specified).
The chunks container is no longer available, as Azure AI Search now inherently manages this functionality.
Storing Vector Embeddings
You can use Azure AI Search to store vector embeddings, which includes content to be indexed and metadata about that content. This approach is useful for similarity searches and semantic searches.
The AzureSearchService.QueryDocumentsAsync() method handles all the logic for storing and querying vector embeddings in Azure AI Search. This method is a good place to start if you want to use Azure AI Search in your application.
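For a sense of what such a query looks like on the wire, here is a minimal sketch that builds the REST request body for a pure vector query against Azure AI Search. The field names ("embedding", "sourcepage") are assumptions; match them to your own index schema:

```python
def build_vector_query(embedding: list[float], k: int = 5) -> dict:
    # Request body for the Azure AI Search documents/search REST API.
    # "embedding" is an assumed vector field name; "select" lists
    # assumed metadata fields from the index schema.
    return {
        "search": "*",
        "vectorQueries": [{
            "kind": "vector",
            "vector": embedding,
            "fields": "embedding",
            "k": k,
        }],
        "select": "id,content,sourcepage",
    }

payload = build_vector_query([0.1, 0.2, 0.3], k=3)
```

POSTing this body to the index's search endpoint returns the k nearest documents by vector similarity.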
Azure AI Search is not a cheap solution, so be sure to pay attention to the pricing. It's essential to consider the costs if you plan to turn on all the features.
The Azure SearchClient library is used in the approach.py file to perform searches on the Azure Search index. This library is a crucial part of interacting with the Azure Search index.
The Document loader - prepdocslib item has more information on interacting with the Azure Search index, providing a valuable resource for developers.
Elasticsearch as Source
You can use Elasticsearch as a data source via API, which is a convenient way to connect an existing Elasticsearch database to Azure OpenAI Studio. This feature is particularly useful for developers who already have Elasticsearch set up. To get started, you'll need an Elasticsearch database and an Azure OpenAI ada002 embedding model.
MongoDB Atlas is also supported as a data source. If you're planning to use MongoDB Atlas, we recommend choosing one of the following Azure OpenAI models: gpt-4 (0613), gpt-4 (turbo-2024-04-09), gpt-4o (2024-05-13), or gpt-35-turbo (1106).
Chat Service
The Chat Service is a crucial component of the Azure Search OpenAI demo, and it's what makes the chat workflow possible.
This service uses the Semantic Kernel to provide the chat workflow with the OpenAI API.
The kernel setup is in the constructor of the service; this is where the Semantic Kernel is wired up to the OpenAI API.
The ReplyAsync() method is where the bulk of the service's logic resides, and it's what drives the RAG pattern for the Chat page in the application.
This method gets the embeddings for the user's question, queries the search index, builds the prompt to send to OpenAI, and if the setting is turned on for providing follow-up questions, it makes a second call to OpenAI to get those.
Once it has the content from search (plus URLs for the related documents in blob storage) and the responses from OpenAI, it returns all the relevant data to the UI.
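The flow that ReplyAsync() implements can be sketched in Python with stand-in functions replacing the real OpenAI and Azure AI Search calls. Here embed, search, and complete are stubs returning canned data, not actual SDK methods:

```python
def embed(text: str) -> list[float]:
    # Stand-in for the embeddings call; returns a fake vector.
    return [float(len(text))]

def search(vector: list[float]) -> list[dict]:
    # Stand-in for the Azure AI Search query; returns one canned hit.
    return [{"content": "Plan covers dental and vision.",
             "url": "https://example.invalid/benefits.pdf"}]

def complete(prompt: str) -> str:
    # Stand-in for the OpenAI chat completion call.
    return "Answer grounded in the retrieved sources."

def reply(question: str, suggest_followups: bool = False) -> dict:
    vector = embed(question)                      # 1. embed the question
    hits = search(vector)                         # 2. query the search index
    sources = "\n".join(h["content"] for h in hits)
    prompt = f"Answer using only these sources:\n{sources}\n\nQ: {question}"
    answer = complete(prompt)                     # 3. first OpenAI call
    followups = []
    if suggest_followups:                         # 4. optional second call
        followups = [complete(f"Suggest follow-ups for: {question}")]
    return {"answer": answer,
            "citations": [h["url"] for h in hits],
            "followups": followups}

result = reply("What does my plan cover?", suggest_followups=True)
```

The key structural point is the optional second completion call: follow-up questions double the OpenAI round-trips when enabled.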
The service is still under active development, so things might have changed since the author last updated the repo on February 14th, 2024.
Content Retrieval and Analysis
Content retrieval in Azure AI Search is a crucial step in the RAG pattern, where queries and responses are coordinated between the search engine and the Large Language Model (LLM). This process involves forwarding a user's question or query to both the search engine and the LLM as a prompt.
The search engine returns search results, which are then redirected to the LLM. The response that makes it back to the user is generative AI, either a summation or answer from the LLM. There's no query type in Azure AI Search that composes new answers, only the LLM provides generative AI.
Azure AI Search offers various query features to formulate queries, including simple or full Lucene syntax, filters and facets, semantic ranker, vector search, and hybrid search. These features can be used to add precision to queries and improve relevance tuning.
Retrievethenread.py
Retrievethenread.py is a simple approach that gets the job done. It retrieves the embeddings for the user question, calls search to get the top number of matches, and then builds a prompt to call OpenAI to provide an answer to the user. Only a single call to OpenAI is needed.
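The prompt-building step of that flow can be sketched as follows. The "sourcepage: content" source format mirrors the general idea; the exact wording and field names here are illustrative, not copied from the repo:

```python
def build_ask_prompt(question: str, hits: list[dict]) -> str:
    # Inline the top search hits as "sourcepage: content" lines, then
    # ask the model to answer from them in one completion call.
    sources = "\n".join(f"{h['sourcepage']}: {h['content']}" for h in hits)
    return ("Answer the question using only the sources below. "
            "Cite the source name for each fact.\n\n"
            f"Sources:\n{sources}\n\nQuestion: {question}")

prompt = build_ask_prompt(
    "What is the deductible?",
    [{"sourcepage": "benefits.pdf#page=2", "content": "Deductible is $500."}],
)
```

Everything the model needs (instructions, grounding data, question) travels in this one prompt, which is why a single OpenAI call suffices.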
This approach is efficient and straightforward.
When you're finished with the demo, you can run azd down --purge to remove everything; it removes all resources and asks you to verify the final deletion. This is the best way to clean up, since it also removes resources that are not visible in the Azure portal.
You can also delete the resource group instead, but be aware that Azure OpenAI and Azure AI Document Intelligence are only soft-deleted when you do that. If you want to purge those resources, you'll need to do it manually.
Content Retrieval
Content retrieval is a crucial step in the content analysis process. It involves fetching relevant data from a search index.
Azure AI Search provides several query features to formulate queries, including simple or full Lucene syntax, filters and facets, semantic ranker, vector search, and hybrid search.
These features enable you to execute queries over text and nonvector numeric content, reduce the search surface area, re-rank results using semantic models, and combine multiple query techniques.
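As a sketch of how those features combine, a hybrid query might look like the following REST request body. The filter expression, vector field name, and semantic configuration name are assumptions, not values from the demo:

```python
def hybrid_query(text: str, embedding: list[float]) -> dict:
    # REST request body combining keyword (BM25) search, a vector
    # query, a filter, and the semantic ranker. Field and config
    # names are assumptions; match them to your index.
    return {
        "search": text,                      # keyword component
        "vectorQueries": [{
            "kind": "vector",
            "vector": embedding,
            "fields": "contentVector",
            "k": 10,
        }],
        "filter": "category eq 'benefits'",  # narrows the search surface
        "queryType": "semantic",             # enables the semantic ranker
        "semanticConfiguration": "default",
        "top": 5,
    }

q = hybrid_query("What is the deductible?", [0.1, 0.2, 0.3])
```

The keyword and vector results are merged by the service, the filter trims the candidate set, and the semantic ranker re-orders the final top results.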
Chunking Technique
The chunking technique is a sophisticated method used to break down content into manageable sections. This technique is particularly useful in content retrieval and analysis, where large amounts of text need to be processed efficiently.
The chunking technique takes into account several factors, including the maximum length of the chunk, sentence limit, and section overlap. These considerations help ensure that the content is chunked in a way that preserves its context and meaning.
Chunking is also effective in dealing with HTML tables, which can be challenging to process due to their complex structure. By using the chunking technique, you can try to keep the context together, even when dealing with tables.
The AzureSearchEmbedService.CreateSections() method is an example of an advanced chunking technique, which goes beyond the basic chunking methods used in demos #1 and #2. This method has been specifically designed to handle complex content and provide accurate results.
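A minimal sliding-window chunker illustrates the max-length and overlap factors described above. The demo's real splitter (in /scripts/prepdocslib/textsplitter.py) also respects sentence boundaries and HTML table structure, which this sketch omits:

```python
def chunk_text(text: str, max_len: int = 1000, overlap: int = 100) -> list[str]:
    # Sliding-window chunker: fixed maximum length plus an overlap
    # between adjacent chunks so context spanning a chunk boundary
    # is not lost.
    assert 0 <= overlap < max_len
    chunks, start = [], 0
    while start < len(text):
        end = min(start + max_len, len(text))
        chunks.append(text[start:end])
        if end == len(text):
            break
        start = end - overlap   # step back to create the overlap
    return chunks

parts = chunk_text("a" * 2500, max_len=1000, overlap=100)
```

With a 2,500-character input, this yields three chunks, and the tail of each chunk reappears at the head of the next.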
Token Usage Estimation
Token usage estimation is crucial for any content retrieval and analysis solution. The retrievethenread.py approach, for instance, makes a single call to OpenAI, which means you'll need to consider the token limit for that API.
Token usage varies between approaches: retrievethenread.py makes a single OpenAI call per question, while the chat approaches can make a second call to generate follow-up questions. Actual usage depends on your specific implementation, so monitor and adjust your token usage to avoid hitting the model's limit.
The custom RAG pattern for Azure AI Search involves sending the user question to Azure AI Search to find relevant information, which can impact token usage. You'll need to consider the number of search results returned and the complexity of the query logic.
Azure AI Search provides a flexible indexing system, allowing you to store both vector and non-vector content. This flexibility can help you optimize token usage, but it also means you'll need to carefully plan your indexing strategy.
By understanding token usage estimation and planning your implementation accordingly, you can build a content retrieval and analysis solution that meets your needs without hitting token limits.
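For rough budgeting before calling the API, a simple heuristic sketch can help. The ~4 characters-per-token ratio is a rule of thumb for English text, not a guarantee; use a real tokenizer (e.g. tiktoken) when accuracy matters:

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English text.
    return max(1, len(text) // 4)

def prompt_budget_ok(system: str, sources: str, question: str,
                     limit: int = 4096, reserve_for_answer: int = 500) -> bool:
    # Check that the prompt parts leave room for the model's answer
    # within the model's context limit.
    used = sum(estimate_tokens(t) for t in (system, sources, question))
    return used + reserve_for_answer <= limit
```

A check like this before each call makes it easy to trim search results when a large retrieval would overflow the context window.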
Custom Rag Pattern
In the Azure Search OpenAI demo, a custom RAG pattern is used to create a personalized search experience.
The pattern is designed to retrieve relevant results from a large dataset, such as a collection of images or text documents, and it's a key component of the demo.
By using a custom RAG pattern, developers can improve the accuracy and relevance of search results, making it easier for users to find what they're looking for.
In the demo, the pattern searches the dataset for results that match the user's query and displays them in a visually appealing way that is easy to browse.
Developers can replicate this functionality in their own projects by combining the Azure Search API with OpenAI's natural language processing capabilities.
Cost and Limitations
Pricing varies per region and usage, so it's hard to predict exact costs for your usage. You can try the Azure pricing calculator for the resources below.
Azure App Service is billed per hour, and you can reduce costs by switching it to the free SKU. Form Recognizer also has a free SKU, but it only analyzes the first 2 pages of each document; you can further reduce its costs by reducing the number of documents or removing the postprovision hook.
To avoid unnecessary costs, remember to take down your app if it's no longer in use, either by deleting the resource group in the Portal or running azd down.
Cost Estimation
Cost estimation can be a challenge, especially when working with cloud services like Azure. Pricing varies per region and usage, making it difficult to predict exact costs.
To get an idea of the costs involved, you can try the Azure pricing calculator for the resources you're using. For example, Azure App Service costs per hour, while Azure OpenAI costs per 1,000 tokens used, with at least 1,000 tokens used per question.
Azure Form Recognizer, on the other hand, costs per document page, with sample documents having a total of 261 pages. Azure Cognitive Search costs per hour, while Azure Blob Storage costs per storage and read operations. Azure Monitor costs based on data ingested.
Here's a breakdown of the costs for each resource:
- Azure App Service: Basic Tier with 1 CPU core, 1.75 GB RAM. Pricing per hour.
- Azure OpenAI: Standard tier, ChatGPT and Ada models. Pricing per 1K tokens used, and at least 1K tokens are used per question.
- Form Recognizer: S0 (Standard) tier using pre-built layout. Pricing per document page; sample documents have 261 pages total.
- Azure Cognitive Search: Standard tier, 1 replica, free level of semantic search. Pricing per hour.
- Azure Blob Storage: Standard tier with ZRS (Zone-redundant storage). Pricing per storage and read operations.
- Azure Monitor: Pay-as-you-go tier. Costs based on data ingested.
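To turn those billing meters into a number, here is a back-of-the-envelope sketch. All rates below are made-up placeholders, not actual Azure prices; look up current prices for your region in the Azure pricing calculator:

```python
HOURS_PER_MONTH = 730

def monthly_estimate(app_service_per_hour: float,
                     search_per_hour: float,
                     openai_per_1k_tokens: float,
                     questions: int,
                     tokens_per_question: int = 1000) -> float:
    # Hourly services bill for the whole month; OpenAI bills per
    # 1K tokens, with at least 1K tokens used per question.
    compute = (app_service_per_hour + search_per_hour) * HOURS_PER_MONTH
    openai = openai_per_1k_tokens * (questions * tokens_per_question / 1000)
    return round(compute + openai, 2)

# Example with placeholder rates: $0.075/hr app service, $0.34/hr
# search, $0.002 per 1K tokens, 5,000 questions per month.
est = monthly_estimate(0.075, 0.34, 0.002, questions=5000)
```

Note how the always-on hourly services dominate at low question volumes, which is why the free SKUs matter for experiments.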
To reduce costs, consider switching to free SKUs for Azure App Service and Form Recognizer. This can be done by changing the parameters file under the infra folder. However, keep in mind that there are some limits to consider, such as the free Form Recognizer resource only analyzing the first 2 pages of each document.
Limitations
When working with Azure Cosmos DB for MongoDB, you should be aware of some limitations that might impact your project.
Only vCore-based Azure Cosmos DB for MongoDB is supported, so if you're planning to use a different type, you'll need to explore alternative options.
The search type is limited to Integrated Vector Database in Azure Cosmos DB for MongoDB with an Azure OpenAI embedding model. This is a specific feature, and you should consider whether it fits your needs before proceeding.
This implementation works best on unstructured and spatial data. If you're working with other types of data, you might need to adjust your approach.
Deployment and Management
To deploy the Azure Search OpenAI demo from scratch, you'll need to run the command `azd up`, which will provision Azure resources and deploy the sample to those resources, including building the search index based on the files found in the ./data folder.
This process can take around 5-10 minutes to complete, so be patient and don't refresh the page too quickly.
Once the application is deployed, a URL is printed to the console; click it to interact with the application in your browser. You may initially see a default welcome screen or an error page; just wait a bit and refresh the page to see the application in action.
User Interface and Experience
The user interface of the Azure Search OpenAI demo is surprisingly intuitive. You can access it by navigating to the Azure WebApp deployed by azd, or by running it locally at 127.0.0.1:50505.
Once you're in, you'll see a few sample question buttons to try out the application. Clicking on these buttons will give you a nice answer back with a citation.
The Thought Process panel is also worth exploring, as it shows a nicely styled panel with information about the flow of the application.
The Supporting Content tab displays a nicely formatted list of the text chunks found in the retrieval step.
You can toggle a settings panel by clicking on the Developer Settings button in the upper right corner, which allows you to change certain settings.
Interestingly, the backend has different RAG approaches between the Chat side and the Ask side, so it's worth asking questions and viewing the Thought Process for each side to see the different results.
Here are some key features of the user interface:
- Sample question buttons to try out the application
- Thought Process panel to see the flow of the application
- Supporting Content tab to view text chunks
- Developer Settings panel to change settings
Resources
If you're interested in exploring the capabilities of Azure Search and OpenAI, here are some resources to get you started:
Azure Cognitive Search is a powerful tool for building scalable and secure search solutions, and it pairs naturally with the Azure OpenAI Service.
To learn more about Azure OpenAI, check out the comparison between Azure OpenAI and OpenAI to see which one suits your needs.
Revolutionize your Enterprise Data with ChatGPT, a next-gen app that leverages Azure OpenAI and Cognitive Search to unlock new insights and possibilities.
You can explore the following resources for more information:
- Azure OpenAI Service
- Azure Cognitive Search
- Comparing Azure OpenAI and OpenAI
- Revolutionize your Enterprise Data with ChatGPT: Next-gen Apps w/ Azure OpenAI and Cognitive Search
Sources
- azure-search-openai-demo-java (github.com)
- azure-search-openai-javascript (github.com)
- azure-search-openai-demo (github.com)
- azure-search-openai-demo-csharp (github.com)
- /Pages/Index.razor (github.com)
- /Pages/VoiceChat.razor (github.com)
- /Pages/Docs.razor (github.com)
- ReadRetrieveReadChatService (github.com)
- AzureBlobStorageService (github.com)
- AzureSearchService (github.com)
- AzureSearchEmbedService (github.com)
- PrepareDocs (github.com)
- EmbedFunctions (github.com)
- Revolutionize your Enterprise Data with ChatGPT: Next-gen Apps w/ Azure OpenAI and Cognitive Search (aka.ms)
- Azure OpenAI Studio (azure.com)
- data preparation script (github.com)
- GitHub (github.com)
- Java (aka.ms)
- JavaScript (aka.ms)
- integrates with Azure AI Search (langchain.com)
- Azure-Samples/azure-search-openai-demo (github.com)
- Notebooks in the demo repository (github.com)
- https://github.com/Azure-Samples/azure-search-openai-demo (github.com)
- retrievethenread.py (github.com)
- chatreadretrieveread.py (github.com)
- chatapproach.py (github.com)
- approach.py (github.com)
- app.py (github.com)
- /scripts/prepdocslib/textsplitter.py (github.com)
- /scripts/prepdocslib (github.com)
- data_ingestion.md (github.com)
- Additional Documentation (github.com)