Unlocking Azure Document Intelligence API Capabilities and Options

Azure Document Intelligence (formerly Form Recognizer) is a cloud service that uses machine learning to extract text, key-value pairs, tables, and structure from documents. That makes it a useful building block for any document management strategy.

With Azure Document Intelligence API, you can extract data from documents, such as text, tables, and even handwritten notes. This is made possible through its pre-trained models and custom models that can be trained on your specific data.

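The API also accepts multiple input formats, including PDF and common image formats such as JPEG, PNG, and TIFF, and some models in recent API versions also accept Word documents. Both printed and handwritten text can be read. This versatility makes it a valuable asset for any organization that deals with documents.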

By leveraging Azure Document Intelligence API, you can automate tasks, improve data accuracy, and gain valuable insights from your documents.

Prerequisites

To get started with Azure Document Intelligence API, you'll need to meet some prerequisites.

First and foremost, you'll need an Azure subscription. Don't worry, you can create one for free.

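If you plan to build your application in C#, you'll also need the Visual Studio IDE or a current version of .NET. For the other SDK languages, any editor and the matching language runtime will do.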

Next, you'll need to create an Azure AI services or Document Intelligence resource. You can use the free pricing tier (F0) to try the service, and upgrade later to a paid tier for production.

You'll also need the key and endpoint from the resource you create to connect your application to the Azure Document Intelligence service. Store the API key securely, such as in Azure Key Vault, and never post it publicly.
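One common way to keep the key out of source code is to read it from environment variables at startup. The sketch below assumes two variable names, `DOCUMENTINTELLIGENCE_ENDPOINT` and `DOCUMENTINTELLIGENCE_API_KEY`; these are illustrative choices, not names the service requires.

```python
import os


def load_credentials(env=None):
    """Read the endpoint and key from environment variables.

    The variable names are assumptions made for this sketch; use
    whatever names your own deployment defines.
    """
    env = os.environ if env is None else env
    endpoint = env.get("DOCUMENTINTELLIGENCE_ENDPOINT", "").rstrip("/")
    key = env.get("DOCUMENTINTELLIGENCE_API_KEY", "")
    if not endpoint or not key:
        raise RuntimeError("Set the endpoint and key environment variables first.")
    return endpoint, key
```

In production, the same function could be pointed at values fetched from a secret store such as Azure Key Vault instead of the process environment.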

To test your application, you'll need a document file at a URL location. Any publicly reachable sample form, such as the ones linked from Microsoft's quickstarts, will work.

Azure Document Intelligence API Basics

Azure Document Intelligence provides SDKs in various languages, including C#, Python, Java, and JavaScript.

You can use these SDKs to extract data from documents such as tax forms.

The data can be modeled in a structured format like JSON or a relational database, with tables for different tax forms (e.g., W-2, 1099) and columns representing the extracted fields.
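As a rough illustration of the relational option, the sketch below stores one set of extracted W-2 fields in an in-memory SQLite table. The table name, column names, and sample values are all hypothetical; a real schema would follow the fields your chosen model actually returns.

```python
import sqlite3

# Hypothetical fields, shaped like what a W-2 extraction might produce.
extracted = {
    "document_id": "doc-001",
    "model_id": "prebuilt-tax.us.w2",
    "employee_name": "Jane Doe",
    "wages": 52000.00,
    "federal_tax_withheld": 6100.50,
}

conn = sqlite3.connect(":memory:")  # in-memory database, just for the sketch
conn.execute(
    """
    CREATE TABLE w2_forms (
        document_id TEXT PRIMARY KEY,
        model_id TEXT NOT NULL,
        employee_name TEXT,
        wages REAL,
        federal_tax_withheld REAL
    )
    """
)
# Named placeholders map dictionary keys onto table columns.
conn.execute(
    "INSERT INTO w2_forms VALUES "
    "(:document_id, :model_id, :employee_name, :wages, :federal_tax_withheld)",
    extracted,
)
conn.commit()

row = conn.execute(
    "SELECT employee_name, wages FROM w2_forms WHERE document_id = ?",
    ("doc-001",),
).fetchone()
print(row)
```

Storing the model ID alongside each row makes it easier to tell later which model produced a given record.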

To handle schema changes, you can store the model ID used for each document processing and periodically check if the model ID has changed.

If a change is detected, you can update your ingestion pipeline to accommodate the new schema.

Here's a basic example of how to handle schema changes in Python:

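```python
import json
from pathlib import Path

# Hypothetical helpers: swap in your own storage and pipeline logic.
STATE_FILE = Path("model_state.json")  # holds the last model ID seen


def get_model_id():
    """Return the model ID stored on the previous run, or None."""
    if STATE_FILE.exists():
        return json.loads(STATE_FILE.read_text()).get("model_id")
    return None


def update_schema(current_id):
    """Placeholder: adjust the ingestion schema, then store the new ID."""
    STATE_FILE.write_text(json.dumps({"model_id": current_id}))


current_id = "prebuilt-tax.us.w2"  # model ID used for the latest document

# Update the schema only if the model ID differs from the stored one
if get_model_id() != current_id:
    update_schema(current_id)
```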

Creating a Resource

To create a Document Intelligence resource, start by signing in to the Azure portal.

From the Azure home page, select Create a resource.

Search for and choose Document Intelligence, which opens the Document Intelligence page. Select the Create button there to configure and create a new instance.

Here's a quick summary of the steps to create a Document Intelligence Resource:

  1. Sign in to the Azure portal.
  2. Select Create a resource from the Azure home page.
  3. Search for and choose Document Intelligence from the search bar.
  4. Select the Create button.

Install Client Library

To install the client library, first choose a programming language. Python is a popular choice, and the library is available on PyPI.

Install it with pip, Python's package installer. For the FormRecognizerClient and FormTrainingClient classes covered later in this article, run `pip install azure-ai-formrecognizer` in your terminal or command prompt. Newer API versions are also served by a separate `azure-ai-documentintelligence` package.

The library requires a recent Python 3 release. The minimum supported version changes between releases, so check the package's PyPI page before installing.

Once the library is installed, import the client classes with `from azure.ai.formrecognizer import FormRecognizerClient, FormTrainingClient`. Credentials come from `azure.core.credentials.AzureKeyCredential`, which wraps your resource key.

Each client is constructed from your resource's endpoint and a credential. Analysis and training calls start long-running operations: methods whose names begin with `begin_` return a poller, and calling `result()` on the poller waits for the outcome.

Use FormRecognizerClient to analyze documents and FormTrainingClient to train and manage custom models.
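A minimal sketch of what calling the service from Python can look like, assuming the `azure-ai-formrecognizer` package (version 3.2 or later, which provides `DocumentAnalysisClient`) and the `prebuilt-layout` model. The function names `analyze_layout` and `summarize` are made up for this example, and the endpoint, key, and document URL are values you supply.

```python
def analyze_layout(endpoint, key, document_url):
    """Run the prebuilt layout model on a document at a URL."""
    # Imported inside the function so this module still loads
    # before the package has been installed.
    from azure.ai.formrecognizer import DocumentAnalysisClient
    from azure.core.credentials import AzureKeyCredential

    client = DocumentAnalysisClient(endpoint, AzureKeyCredential(key))
    poller = client.begin_analyze_document_from_url("prebuilt-layout", document_url)
    return poller.result()  # blocks until the long-running operation finishes


def summarize(result):
    """Return (page_count, table_count) for an analysis result."""
    return len(result.pages), len(result.tables or [])
```

Calling `summarize(analyze_layout(endpoint, key, url))` would then give a quick sanity check that the document was read.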

Creating a Resource

To start creating a Document Intelligence resource, you need to sign in to the Azure portal. From the Azure home page, select Create a resource. Search for and choose Document Intelligence from the search bar, then select the Create button.

Note that you can use the free pricing tier (F0) to try the service. After your resource deploys, select Go to resource to get your key and endpoint.

Data Management

Document Intelligence handles two kinds of data: the documents you submit for analysis, and the training data and models behind custom models.

To process a document, the service temporarily stores the data in Azure Storage and deletes it within 24 hours from the time you submit the analyze request. This applies to all features of Document Intelligence.

For custom models, the training data lives in your own Azure Blob Storage container, and the interim outputs from analysis and labeling are stored in that same location. The trained custom models themselves are stored in Azure Storage and logically isolated with your Azure subscription and API credentials.

Data Storage

To process your document, Document Intelligence temporarily stores the data in Azure Storage in the same region as the request.

The data is deleted within 24 hours from the time you submit the analyze request.

For trained custom models, interim outputs after analysis and labeling are stored in the same Azure Storage location where you store your training data.

Trained custom models are stored in Azure Storage in the same region, and are logically isolated with your Azure subscription and API credentials, so your data stays secure.

Specifying Page Range

For multi-page documents, you can restrict analysis to a range of pages with the pages parameter, supported in v2.1, v3.0, and later versions of the REST API.

To specify single pages, enter their page numbers separated by commas. For example, 1, 2 processes pages 1 and 2.

You can also specify a finite range of pages, such as 2-5, which will process pages 2 to 5. This is useful when you want to analyze a specific section of the document.

Open-ended ranges are also supported, where you can specify a starting or ending page number, such as 5- or -10. This will process all the pages from the specified number, or from the beginning to the specified number, respectively.

Here are the different types of page ranges you can specify:

  • Single pages: 1, 2
  • Finite ranges: 2-5
  • Open-ended ranges: 5-, -10

You can mix these forms together, and ranges can overlap. For example, -5, 1, 3, 5-10 processes pages 1 to 10. The service accepts the request as long as it can process at least one page of the document, so 5-100 on a 5-page document is valid and processes page 5.
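To make the range rules concrete, here is a small client-side sketch that expands a pages string into the page numbers it covers. It only illustrates the semantics described above and is not the service's own implementation; the function name `expand_page_ranges` is hypothetical.

```python
def expand_page_ranges(spec, page_count):
    """Expand a pages string such as "-5, 1, 3, 5-10" into sorted page numbers."""
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            # An empty side means "from the first page" or "to the last page".
            start = int(start_text) if start_text.strip() else 1
            end = int(end_text) if end_text.strip() else page_count
        else:
            start = end = int(part)
        # Clamp to pages that actually exist in the document.
        start, end = max(start, 1), min(end, page_count)
        pages.update(range(start, end + 1))
    if not pages:
        raise ValueError("the range does not cover any page of the document")
    return sorted(pages)


print(expand_page_ranges("-5, 1, 3, 5-10", page_count=12))
```

Overlapping ranges collapse naturally because the pages are collected in a set before sorting.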

Studio Access Permissions

To access Document Intelligence Studio, you'll need an active Azure account and subscription with at least a Reader role. This will give you the necessary permissions to explore the studio.

Beyond that baseline, the roles you need depend on the scenario. For document analysis and prebuilt models, Microsoft's documentation lists the Cognitive Services User role on the Document Intelligence or Azure AI services resource for basic access to the analyze page, and the Contributor role if you also need to create resource groups or resources. Review the Microsoft Entra built-in roles and your Azure role assignments to confirm you have the correct permissions.

With the right role assignments in place, the Studio features you need will load without access errors.

Training and Improvement

Training a custom model in Azure Document Intelligence API is a straightforward process. You can train a model with your own data, and after training, you can test, retrain, and use it to reliably extract data from more forms according to your needs.

You can train a custom model with or without labels. Training with labels relies on the Document Intelligence Sample Labeling tool, which provides a graphical user interface for labeling your training documents, and it can lead to better performance in some scenarios.

However, training with labels requires special label information files (.pdf.labels.json) in your blob storage container alongside the training documents. These files can be created using the Document Intelligence Sample Labeling tool.

Training a model without labels is also possible, and it can analyze all the fields and values found in your custom forms without manual labeling. The returned CustomFormModel object contains information on the form types the model can analyze and the fields it can extract from each form type.

You can also train a model on a given set of documents and print the model's status to the console. This is useful for testing and debugging purposes.
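The training flow above might look like this in Python, assuming the `azure-ai-formrecognizer` package and its `FormTrainingClient`. The function name `train_custom_model` is made up for this sketch, and `training_files_url` is expected to be a SAS URL for the blob container that holds your training documents.

```python
def train_custom_model(endpoint, key, training_files_url, use_labels=False):
    """Train a custom model and print its status and extractable fields."""
    # Imported inside the function so this module still loads
    # before the package has been installed.
    from azure.ai.formrecognizer import FormTrainingClient
    from azure.core.credentials import AzureKeyCredential

    client = FormTrainingClient(endpoint, AzureKeyCredential(key))
    poller = client.begin_training(training_files_url, use_training_labels=use_labels)
    model = poller.result()  # a CustomFormModel once training completes

    print("Model ID:", model.model_id)
    print("Status:", model.status)
    for submodel in model.submodels:
        print("Form type:", submodel.form_type)
        # submodel.fields is a mapping keyed by field name.
        print("Fields:", ", ".join(submodel.fields))
    return model
```

Passing `use_labels=True` assumes the label files are already present in the container next to the training documents.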

To improve the results of your custom model, you should examine the "confidence" values for each key/value result under the "pageResults" node. You should also look at the confidence scores in the "readResults" node, which correspond to the text read operation.

Here are some tips to improve your results:

  • If the confidence scores for the read operation are low, try to improve the quality of your input documents.
  • If the confidence scores for the key/value extraction operation are low, ensure that the documents being analyzed are of the same type as documents used in the training set.

The confidence scores you target depend on your use case, but generally, it's a good practice to target a score of 80 percent or higher. For more sensitive cases, like reading medical records or billing statements, a score of 100 percent is recommended.
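One way to act on these scores is to route low-confidence values to manual review. The sketch below applies the 80 percent target to a made-up, heavily simplified response fragment in the "pageResults" shape; the sample data and the function name `split_by_confidence` are illustrative assumptions.

```python
THRESHOLD = 0.80  # the 80 percent target mentioned above

# Simplified, made-up response fragment in the "pageResults" shape.
analyze_result = {
    "pageResults": [
        {
            "page": 1,
            "keyValuePairs": [
                {"key": {"text": "Invoice No"}, "value": {"text": "1042"}, "confidence": 0.97},
                {"key": {"text": "Total"}, "value": {"text": "$310.00"}, "confidence": 0.62},
            ],
        }
    ]
}


def split_by_confidence(result, threshold=THRESHOLD):
    """Separate key/value pairs into accepted and needs-review lists."""
    accepted, review = [], []
    for page in result.get("pageResults", []):
        for pair in page.get("keyValuePairs", []):
            item = (page["page"], pair["key"]["text"], pair["value"]["text"], pair["confidence"])
            # Pairs at or above the threshold are accepted; the rest go to review.
            (accepted if pair["confidence"] >= threshold else review).append(item)
    return accepted, review


accepted, review = split_by_confidence(analyze_result)
print("accepted:", accepted)
print("needs review:", review)
```

For sensitive documents, raising the threshold sends more values to review rather than silently accepting them.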

You can use the List Custom Models API to return a list of all the custom models that belong to your subscription. This is useful for managing and maintaining your custom models.

Each train operation generates a new model, so you can't retrain a custom model in the classical sense. However, you can add more samples to your training dataset and train a new model, or create a new model to compose with your original model.

Troubleshooting and Next Steps

When a request fails, the Document Intelligence client library surfaces the service's error with the same HTTP status code that a REST API request would return. In .NET, this arrives as a RequestFailedException.

For example, submitting a receipt image with an invalid URI returns a 400 error, indicating Bad Request.

To get started with Document Intelligence, complete a quickstart using one of the available SDKs, including C#, Python, Java, JavaScript, or REST API.

Troubleshooting

Errors returned by the service surface in the Azure AI Document Intelligence client library with the same HTTP status code that a REST API request would return. In the .NET library, they result in a RequestFailedException.

For example, if you submit a receipt image with an invalid URI, a 400 error is returned, indicating Bad Request.

The Python client library raises exceptions defined in Azure Core, such as HttpResponseError, which carries the status code and error message to help you identify the cause.
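In Python, handling such a failure could look like the sketch below. It assumes `client` is a `FormRecognizerClient` from the `azure-ai-formrecognizer` package and catches `HttpResponseError` from `azure.core.exceptions`; the function name `recognize_receipt_safely` is made up for illustration.

```python
def recognize_receipt_safely(client, receipt_url):
    """Return recognized receipts, or None if the service rejects the request."""
    # Imported inside the function so this module still loads
    # before azure-core has been installed.
    from azure.core.exceptions import HttpResponseError

    try:
        poller = client.begin_recognize_receipts_from_url(receipt_url)
        return poller.result()
    except HttpResponseError as error:
        # status_code mirrors the REST response, e.g. 400 for an invalid URI.
        print(f"Request failed with status {error.status_code}: {error.message}")
        return None
```

Returning None keeps the caller simple; a production pipeline might log the error and re-raise instead.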

You can use the FormRecognizerClient to recognize form fields and content, including tables, lines, and words, and the FormTrainingClient to create and manage custom models.

Here are some common tasks you can perform with the FormRecognizerClient:

  • Recognize form fields and content by using custom models trained to analyze your custom forms.
  • Recognize form content, including tables, lines, and words, without the need to train a model.
  • Recognize common fields from US receipts, business cards, invoices, and ID documents using a pretrained model on the Document Intelligence service.
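As an illustration of the last task, the sketch below recognizes receipts from a URL and flattens the returned fields into rows. It assumes the Python `azure-ai-formrecognizer` package; the helper names `recognize_receipts` and `flatten_fields` are hypothetical.

```python
def recognize_receipts(endpoint, key, receipt_url):
    """Run the pretrained receipt model on an image or PDF at a URL."""
    # Imported inside the function so this module still loads
    # before the package has been installed.
    from azure.ai.formrecognizer import FormRecognizerClient
    from azure.core.credentials import AzureKeyCredential

    client = FormRecognizerClient(endpoint, AzureKeyCredential(key))
    return client.begin_recognize_receipts_from_url(receipt_url).result()


def flatten_fields(recognized_forms):
    """Return (form_index, field_name, value, confidence) rows."""
    rows = []
    for index, form in enumerate(recognized_forms):
        # Each recognized form exposes a mapping of field name to field object.
        for name, field in form.fields.items():
            rows.append((index, name, field.value, field.confidence))
    return rows
```

The flattened rows are convenient to write to a table or to filter by confidence before use.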

The FormTrainingClient provides operations for training custom models to analyze all fields and values found in your custom forms, as well as managing models created in your account.

Next Steps

Now that you've learned the basics of Document Intelligence, it's time to take your skills to the next level. You can explore the Document Intelligence Studio and reference documentation to learn more about the service and its capabilities.

The Document Intelligence Studio is a great place to start, where you can try out different features and see how they work. You can also explore the Document Intelligence REST API, which allows you to interact with the service programmatically.
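For a sense of what programmatic access involves, the sketch below builds (but does not send) an analyze request using only the standard library. The route and the `2023-07-31` API version reflect the v3.x REST API as an assumption, so check which versions your resource supports; the endpoint, key, and document URL are placeholders.

```python
import json
import urllib.parse
import urllib.request

API_VERSION = "2023-07-31"  # assumed; check which versions your resource supports


def build_analyze_request(endpoint, key, model_id, document_url, pages=None):
    """Build (but do not send) a POST request that starts an analysis."""
    query = {"api-version": API_VERSION}
    if pages:
        query["pages"] = pages  # optional page range, e.g. "1-3"
    url = (
        f"{endpoint.rstrip('/')}/formrecognizer/documentModels/"
        f"{model_id}:analyze?{urllib.parse.urlencode(query)}"
    )
    body = json.dumps({"urlSource": document_url}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Ocp-Apim-Subscription-Key": key,  # the resource key authenticates the call
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


request = build_analyze_request(
    "https://example.cognitiveservices.azure.com",  # placeholder endpoint
    "<your-key>",  # placeholder key
    "prebuilt-layout",
    "https://example.com/sample.pdf",  # placeholder document URL
    pages="1-3",
)
print(request.get_method(), request.full_url)
```

Sending the request with `urllib.request.urlopen(request)` starts an asynchronous operation; the service's response includes an Operation-Location header with the URL to poll for the result.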

To get started, make sure you have an Azure subscription, which you can create for free. You'll also need the Visual Studio IDE or current version of .NET Core, as well as an Azure Storage blob that contains a set of training data.

Here are the tasks you can work through to extract structured data from forms and documents:

  • Authenticate the client
  • Analyze Layout
  • Analyze receipts
  • Analyze business cards
  • Analyze invoices
  • Analyze ID documents
  • Train a custom model
  • Analyze forms with a custom model
  • Manage custom models

Once you have everything set up, you can start building your application using the Document Intelligence client library for .NET. Install the library with the `dotnet add package` command (the package is published on NuGet as Azure.AI.FormRecognizer), and create variables for your resource's key and endpoint in your application's Program class.

Don't forget to remove the key from your code when you're done, and to use secure methods to store and access your credentials in production.

Frequently Asked Questions

What is document intelligence in Azure?

Document Intelligence in Azure is an AI-powered service that automatically extracts text, data, and structures from documents, turning them into usable information. This enables you to focus on insights and action, rather than manual data compilation.

What is the equivalent of Document AI in Azure?

Azure AI Document Intelligence (formerly Form Recognizer) is Azure's counterpart to Google Cloud's Document AI. It offers cloud-based machine learning models to automate document data extraction and support document search.

What is Azure Cognitive Services API?

Azure Cognitive Services, now part of Azure AI services, is a set of cloud-based AI APIs that provide pre-trained models for easy integration into applications. These APIs let developers build AI-powered applications without training their own models or collecting large datasets.

Tanya Hodkiewicz

Junior Assigning Editor

Tanya Hodkiewicz is a seasoned Assigning Editor with a keen eye for compelling content. With a proven track record of commissioning articles that captivate and inform, Tanya has established herself as a trusted voice in the industry. Her expertise spans a range of categories, including "Important" pieces that tackle complex, timely topics and "Decade in Review" features that offer insightful retrospectives on significant events.
