Business Central Export to Data Lake Process Explained

Exporting Business Central data to a data lake takes only a few steps.

First, you need to set up a data lake in Azure Data Lake Storage (ADLS) Gen2, which can be done through the Azure portal.

To export data from Business Central to the data lake, you create a data export job in Business Central. The job defines which data is exported and where in the data lake it lands.

Once scheduled, the job runs periodically and exports the data from Business Central to the data lake.

Exporting to Data Lake

Exporting to Data Lake is a powerful feature in Business Central that allows you to export data to Azure Data Lake Storage with minimal effort. You can export data in Delta Parquet format, which supports ACID transactions and schema evolution, making it ideal for big data scenarios.

If you're looking for a more integrated and streamlined experience, you can export directly to OneLake, which is part of Microsoft Fabric. This method minimizes the need for downstream applications to manage data conversions, offering a more seamless experience.

The export process in Business Central makes incremental updates to the data lake, based on the changes made since the last run. Select the fields to export explicitly and judiciously, because some kinds of fields are not supported.
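
The idea of an incremental update can be pictured with a small sketch. It assumes, hypothetically, that every record carries an increasing version number and that the exporter remembers the highest version it has already sent. The names select_delta, version, and last_exported are illustrative and are not taken from the Business Central extension.

```python
from typing import Iterable


def select_delta(records: Iterable[dict], last_exported: int) -> tuple[list[dict], int]:
    """Return the records changed since the last run and the new high-water mark."""
    # Hypothetical rule: a record is "changed" if its version is newer than the mark.
    delta = [r for r in records if r["version"] > last_exported]
    new_mark = max((r["version"] for r in delta), default=last_exported)
    return delta, new_mark


records = [
    {"id": "A", "version": 10},
    {"id": "B", "version": 12},
    {"id": "C", "version": 15},
]
delta, mark = select_delta(records, last_exported=10)
print([r["id"] for r in delta], mark)
```

Only the records newer than the stored mark are selected, which is why a run after a quiet period exports little or nothing.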

Here are some key things to keep in mind when exporting data:

  • BLOB, Flow, and Filter fields, as well as fields that have been Obsoleted or disabled, are not supported.
  • Fields with Access Properties other than Public will not be exported.
  • Records created before the time when the SystemCreatedAt audit field was introduced may have the field set to null.
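
The field rules above can be expressed as a simple check. This is only a sketch: the dictionary keys (type, field_class, obsolete, enabled, access) are a hypothetical way of describing a field and do not mirror how the extension stores field metadata.

```python
def is_exportable(field: dict) -> bool:
    """Apply the listed rules to one (hypothetical) field description."""
    return (
        field.get("type") != "BLOB"
        and field.get("field_class", "Normal") == "Normal"  # excludes Flow and Filter fields
        and not field.get("obsolete", False)
        and field.get("enabled", True)
        and field.get("access", "Public") == "Public"
    )


fields = [
    {"name": "No.", "type": "Code"},
    {"name": "Picture", "type": "BLOB"},
    {"name": "Balance", "type": "Decimal", "field_class": "FlowField"},
    {"name": "Date Filter", "type": "Date", "field_class": "FlowFilter"},
    {"name": "Old Code", "type": "Code", "obsolete": True},
    {"name": "Secret", "type": "Text", "access": "Internal"},
]
print([f["name"] for f in fields if is_exportable(f)])
```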

Export to OneLake

OneLake is part of Microsoft Fabric, which makes exporting to it the most integrated and streamlined option.

It is designed to minimize the data conversions that downstream applications would otherwise have to manage, so it suits teams that want to simplify data management and avoid manual conversion steps.

Export to Lake

There are three ways to export data to the lake: a delta load into Azure Data Lake, export in Delta Parquet format, and export to OneLake with Microsoft Fabric.

A delta load from BC into Azure Data Lake is straightforward, and you can verify that the export is working by checking your data lake for files. When the files are read in Power Query, rows without a modification timestamp are filtered out with the step #"Filtered Rows" = Table.SelectRows(#"Removed Duplicates", each ([#"SystemModifiedAt-2000000003"] <> null)).
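
For readers who consume the CSV files outside Power BI, the filter step quoted above has a rough Python equivalent. The column name SystemModifiedAt-2000000003 comes from that expression. The sample data, the No-1 column, and the choice to treat an empty CSV cell as null are assumptions made for illustration.

```python
import csv
import io

MODIFIED_AT = "SystemModifiedAt-2000000003"  # column name taken from the M expression

# Hypothetical sample: one duplicated row and one row with no timestamp.
SAMPLE = """No-1,SystemModifiedAt-2000000003
10000,2024-01-05T10:00:00Z
10000,2024-01-05T10:00:00Z
20000,
"""


def filtered_rows(csv_text: str) -> list[dict]:
    """Drop exact duplicate rows, then keep rows that have a modification timestamp."""
    seen = set()
    result = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = tuple(row.items())
        if key in seen:  # exact duplicate of an earlier row
            continue
        seen.add(key)
        if row[MODIFIED_AT]:  # assumption: an empty cell stands in for null
            result.append(row)
    return result


print(filtered_rows(SAMPLE))
```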

Exporting data in Delta Parquet format is a good option for better performance and scalability, as it supports ACID transactions and schema evolution.

OneLake with Microsoft Fabric is the most integrated and streamlined option, designed to minimize the need for downstream applications to manage data conversions.

To export data from BC, open page 82560, Export to Azure Data Lake Storage, and add the tables to export in the grid at the bottom of the page. Be sure to explicitly select the fields in each table that should be exported.

Here are some important notes to keep in mind when exporting data from BC:

  • BLOB, Flow, and Filter fields, as well as fields that have been Obsoleted or disabled, are not supported.
  • Records created before the SystemCreatedAt audit field was introduced have that field set to null.
  • On export, such a field is given an artificial value of 01 Jan 1900, regardless of the time zone of the deployment.
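
A consumer that wants to tell real creation times from the 1900 placeholder can map the placeholder back to a missing value. The sketch below assumes the timestamp arrives as an ISO 8601 string without a trailing "Z"; the actual format in the exported files may differ, and created_at_or_none is a hypothetical helper.

```python
from datetime import datetime
from typing import Optional

SENTINEL = datetime(1900, 1, 1)  # the artificial value described above


def created_at_or_none(value: str) -> Optional[datetime]:
    """Parse an exported timestamp, mapping the 1900 placeholder back to None."""
    parsed = datetime.fromisoformat(value)
    # Compare without time zone info, since the placeholder ignores the time zone.
    return None if parsed.replace(tzinfo=None) == SENTINEL else parsed


print(created_at_or_none("1900-01-01T00:00:00"))
print(created_at_or_none("2023-06-01T08:30:00"))
```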

From version 21, the feature called Report read-only data access needs to be enabled to minimize the performance impact of the export on normal ERP operations.

Azure Data Lake

Azure Data Lake is a powerful tool for storing and managing large amounts of data. You can export data from Business Central into Azure Data Lake using the "Delta load from BC into Azure Data Lake" method.

To verify the export is working, check your data lake for the files. On the consuming side, duplicate rows and rows with a null modification timestamp are filtered out.

Exporting data in Delta Parquet format is another option that offers better performance and scalability. Delta files support ACID transactions and schema evolution, making them more robust for big data scenarios.

You can consume the resulting CDM data using Power BI. Simply create a new Power BI report, select Get data, and then select Azure Data Lake Storage Gen2 and Connect.

To connect to Azure Data Lake Storage Gen2, enter your Data Lake Storage endpoint, which can be found on the Endpoints blade of the Storage Account resource in the Azure Portal. Select CDM Folder View (Beta) and OK.
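
In the public Azure cloud, the Data Lake Storage endpoint normally follows the pattern https://<account>.dfs.core.windows.net. The helper below builds it from the account name as a convenience; sovereign clouds use other suffixes, so the value shown on the Endpoints blade remains authoritative. The function name and the example names contosolake and bc-export are hypothetical.

```python
import re


def dfs_endpoint(storage_account: str, container: str = "") -> str:
    """Build the Data Lake Storage (dfs) endpoint URL for the public Azure cloud."""
    # Storage account names are 3 to 24 lowercase letters and digits.
    if not re.fullmatch(r"[a-z0-9]{3,24}", storage_account):
        raise ValueError(f"invalid storage account name: {storage_account!r}")
    url = f"https://{storage_account}.dfs.core.windows.net"
    return f"{url}/{container}" if container else url


print(dfs_endpoint("contosolake"))
print(dfs_endpoint("contosolake", "bc-export"))
```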

In the navigator that opens, use the data-manifest database icon to select which tables to load.

Consuming the shared metadata tables is another option. You can access your tables using Spark or Serverless SQL in Azure Synapse Analytics, and other consumers such as Power BI can connect through the Serverless SQL endpoint in Import mode or DirectQuery mode.

Integration and Setup

To set up the export to Azure Data Lake, you'll need to create an Azure Storage Account, which can be done by following the steps outlined in the Azure documentation.

You'll also need a Service Principal account with the Storage Blob Data Contributor RBAC role on the Data Lake. This involves creating an Azure Service Principal and adding a role assignment for it in the Storage Account.

To configure the export in Business Central, you'll need to install the ADLS extension, which involves setting up a Business Central Sandbox or Prod environment, preparing Visual Studio Code, and building and installing the extension. Once installed, you can add the necessary information, such as Tenant, Storage Account, Container, App ID, and Secret value, and choose the fields to export for each table.

To schedule a recurring export job, click "Schedule export" in the extension configuration, choose the start date, and configure the recurrence. Each run produces a deltas manifest file and a data manifest file, a "deltas" folder with .csv files, and a cdm.json file for each exported table.
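
One way to sanity-check a run is to summarize a listing of the container. The sketch below works on a plain list of paths and makes assumptions about the layout that should be checked against a real container: CSV files sit under deltas/<entity>/, manifest files have ".manifest." in their names, and each table has a <entity>.cdm.json file at the root. All file names in the sample are hypothetical.

```python
from collections import Counter
from pathlib import PurePosixPath


def summarize(paths: list[str]) -> dict:
    """Count delta CSV files per entity and list entities that have a cdm.json file."""
    csv_per_entity = Counter()
    schemas = set()
    for raw in paths:
        p = PurePosixPath(raw)
        if not p.parts:
            continue
        if p.parts[0] == "deltas" and p.suffix == ".csv" and len(p.parts) >= 3:
            csv_per_entity[p.parts[1]] += 1  # assumed layout: deltas/<entity>/<file>.csv
        elif len(p.parts) == 1 and p.name.endswith(".cdm.json") and ".manifest." not in p.name:
            schemas.add(p.name[: -len(".cdm.json")])
    return {"deltas": dict(csv_per_entity), "schemas": sorted(schemas)}


listing = [
    "deltas.manifest.cdm.json",
    "data.manifest.cdm.json",
    "Customer-18.cdm.json",
    "deltas/Customer-18/a.csv",
    "deltas/Customer-18/b.csv",
]
print(summarize(listing))
```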

Export to CSV

Export to CSV is the option most familiar to anyone used to the old Export to Data Lake feature in Dynamics 365 Finance & Operations (D365 F&O).

This option exports your D365 F&O data to Azure Data Lake in CSV format within your own Azure subscription.

CSV is a widely supported format, so the exported data is easy to work with in a variety of applications.

This option is often labeled BYOL CSV. BYOL here stands for bring your own lake: the data lands in a storage account that you provision and pay for yourself, so make sure that account is in place before you proceed.

Consuming Shared Tables

You can access your tables using Spark or Serverless SQL in Azure Synapse Analytics if you've configured bc2adls to create shared metadata tables for your exported entities.

With bc2adls, you can connect other consumers like Power BI through the Serverless SQL endpoint of your Synapse workspace, allowing you to connect in Import mode or DirectQuery mode.

The lake database can be found in your workspace's Data section, where you can expand it to see the shared metadata tables it contains.

From here, you can directly load a table into a SQL script or a Spark notebook to start working with your data.
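
Tools outside Synapse Studio reach the same tables through the workspace's Serverless SQL endpoint, which has the form <workspace>-ondemand.sql.azuresynapse.net. The sketch below only assembles an ODBC connection string. The workspace and database names are hypothetical, and the driver version and the interactive Microsoft Entra authentication mode are assumptions that may not match your environment.

```python
def serverless_sql_connection_string(workspace: str, database: str) -> str:
    """Assemble an ODBC connection string for a Synapse Serverless SQL endpoint."""
    server = f"{workspace}-ondemand.sql.azuresynapse.net"
    parts = {
        "Driver": "{ODBC Driver 18 for SQL Server}",     # assumption: driver 18 is installed
        "Server": f"tcp:{server},1433",
        "Database": database,                            # the lake database name
        "Encrypt": "yes",
        "Authentication": "ActiveDirectoryInteractive",  # assumption: interactive sign-in
    }
    return ";".join(f"{key}={value}" for key, value in parts.items()) + ";"


print(serverless_sql_connection_string("contoso-synapse", "bc_lake"))
```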

Setting It Up

To set up the connection in Business Central, you'll need a few key components. The first is an Azure Storage Account with Data Lake enabled, that is, a Gen2 storage account with hierarchical namespace enabled.

You'll also need a Service Principal account with the Storage Blob Data Contributor RBAC role on the Data Lake. To create both, follow these steps:

1. Set up an Azure Storage Account by following Microsoft's documentation.

2. Create an Azure Service Principal (App registration) and save the Application (client) ID, Directory (tenant) ID, and Secret value.

3. In the Storage Account, add a role assignment that grants the Service Principal the Storage Blob Data Contributor role.

Here's a quick rundown of what you'll need to save:

  • Storage account name
  • Application (client) ID
  • Directory (tenant) ID
  • Secret value
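
It can help to keep these four values together and validate them before typing them into Business Central. The class below is a hypothetical convenience, not part of the extension. It checks that the storage account name uses 3 to 24 lowercase letters and digits, that both IDs parse as GUIDs, and it keeps the secret out of the printed representation.

```python
import re
import uuid
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LakeConnection:
    storage_account: str
    client_id: str                   # Application (client) ID
    tenant_id: str                   # Directory (tenant) ID
    secret: str = field(repr=False)  # never shown when the object is printed

    def __post_init__(self) -> None:
        if not re.fullmatch(r"[a-z0-9]{3,24}", self.storage_account):
            raise ValueError("storage account name must be 3-24 lowercase letters or digits")
        for value in (self.client_id, self.tenant_id):
            uuid.UUID(value)  # raises ValueError if the value is not a GUID
        if not self.secret:
            raise ValueError("secret value must not be empty")


conn = LakeConnection(
    storage_account="contosolake",
    client_id="11111111-2222-3333-4444-555555555555",
    tenant_id="aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
    secret="s3cr3t",
)
print(conn)
```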

Integration Pipeline

To run the integration pipeline, you'll need to invoke the Consolidation_AllEntities pipeline after one or more export processes from BC have completed.

This pipeline consolidates all incremental updates made from BC into one view. It takes three parameters: containerName, deleteDeltas, and sparkpoolName.

The containerName parameter is the name of the data lake container to which the data has been exported.

You'll also need to decide whether to delete the deltas, which is controlled by the deleteDeltas flag. It's generally a good idea to set this to true, unless you want to debug or troubleshoot the pipeline.

If you want to create shared metadata tables, you can specify the name of the Spark pool in the sparkpoolName parameter.

Here are the parameters you'll need to provide:

  • containerName: the name of the data lake container
  • deleteDeltas: a flag to delete the deltas
  • sparkpoolName: (optional) the name of the Spark pool (or leave blank to skip shared metadata tables)
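
The parameters can be prepared as a small JSON document before they are entered in the portal or attached to a trigger. The sketch below uses the three parameter names listed above. The helper name and the bc-export container are hypothetical, and how the values are submitted to the pipeline is not shown.

```python
import json


def consolidation_parameters(container_name: str,
                             delete_deltas: bool = True,
                             sparkpool_name: str = "") -> dict:
    """Build the parameter set for the Consolidation_AllEntities pipeline."""
    if not container_name:
        raise ValueError("containerName is required")
    return {
        "containerName": container_name,
        "deleteDeltas": delete_deltas,    # set to False to keep deltas for troubleshooting
        "sparkpoolName": sparkpool_name,  # blank skips the shared metadata tables
    }


print(json.dumps(consolidation_parameters("bc-export"), indent=2))
```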

To run or schedule the pipeline, follow the instructions for pipeline execution and triggers in Azure Data Factory or Azure Synapse Analytics.

Katrina Sanford

Writer