
Azure Synapse Link for Dataverse connects Microsoft Dataverse to Azure Synapse Analytics so that you can analyze business application data alongside data from other sources.
It brings together data from various sources and lets you create a unified view of your business.
The integration works by continuously replicating data from Dataverse to Azure Synapse Analytics and Azure Data Lake Storage Gen2. The export runs in one direction, from Dataverse to Azure.
With this integration, you can analyze and visualize your data in near real-time, making it easier to make informed business decisions.
Azure Synapse Link for Dataverse is set up from the Power Apps maker portal, with a guided process that does not require you to build your own export pipelines.
It's also highly scalable, allowing you to handle large volumes of data and support growing business needs.
Azure Synapse Link for Dataverse
Azure Synapse Link for Dataverse is a powerful tool that enables seamless data integration and analytics capabilities. It's designed to empower organizations to leverage their data assets efficiently.
To use finance and operations data with Azure Synapse Link for Dataverse, you'll need a finance and operations sandbox (Tier-2) or higher environment, and the environment must be version 10.0.36 (PU 60) cumulative update 7.0.7036.133 or later. You'll also need to link your finance and operations environment with Microsoft Power Platform and enable the Sql row version change tracking configuration key.
Azure Synapse Link for Dataverse offers several key features, including integration, simplified data access, unified analytics, performance optimizations, cost-effective data storage, and scalability. These features enable organizations to leverage their data assets efficiently and make informed decisions.
Prerequisites
To set up Azure Synapse Link for Dataverse, you'll need to meet certain prerequisites. You must have an Azure Data Lake Storage Gen2 account on which you hold both the Owner and Storage Blob Data Contributor roles. The storage account must have Hierarchical namespace enabled for both initial setup and delta sync.
You'll also need to create an Azure Synapse Link with managed identities, which requires using managed identities for Azure with your Azure data lake storage. If you don't have managed identities set up, you'll need to enable public network access for Azure resources for both initial setup and delta sync.
In addition, you'll need to have Reader role access to the resource group with the storage account. To link the environment to Azure Data Lake Storage Gen2, you'll need to have the Dataverse system administrator security role.
Only tables that have change tracking enabled can be exported. It's also worth noting that the creation of Azure Synapse Link profiles under a single Dataverse environment is limited to a maximum of 10.
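The prerequisites above can be turned into a simple pre-flight checklist. The sketch below is a hypothetical helper, not part of any Azure or Power Platform SDK. The class and field names are assumptions made for illustration, and the values would have to be collected from the Azure portal or your own tooling.

```python
from dataclasses import dataclass, field

MAX_PROFILES_PER_ENVIRONMENT = 10  # limit stated in the prerequisites above


@dataclass
class LinkSetup:
    """Hypothetical description of a planned Azure Synapse Link setup."""
    hierarchical_namespace: bool
    storage_roles: set = field(default_factory=set)         # roles held on the storage account
    resource_group_roles: set = field(default_factory=set)  # roles held on the resource group
    dataverse_roles: set = field(default_factory=set)       # Dataverse security roles
    existing_profiles: int = 0                               # profiles already in the environment


def missing_prerequisites(setup: LinkSetup) -> list:
    """Return human-readable problems; an empty list means every check passed."""
    problems = []
    if not setup.hierarchical_namespace:
        problems.append("Enable Hierarchical namespace on the storage account")
    for role in ("Owner", "Storage Blob Data Contributor"):
        if role not in setup.storage_roles:
            problems.append(f"Missing storage account role: {role}")
    if "Reader" not in setup.resource_group_roles:
        problems.append("Missing resource group role: Reader")
    if "System Administrator" not in setup.dataverse_roles:
        problems.append("Missing Dataverse security role: System Administrator")
    if setup.existing_profiles >= MAX_PROFILES_PER_ENVIRONMENT:
        problems.append("Environment already has the maximum of 10 profiles")
    return problems
```

Calling `missing_prerequisites` with a fully populated `LinkSetup` returns an empty list, while any gap comes back as a message you can act on before starting the setup wizard.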
Exploring Key Capabilities
Azure Synapse Link for Dataverse offers a range of powerful capabilities that empower organizations to integrate and analyze data efficiently. One of the key features is flexible entity and table selection, allowing users to choose from a comprehensive selection of both standard and custom Finance and Operations entities and tables.
This flexibility enables organizations to tailor their data integration and analytics workflows to their specific needs. By selecting the entities and tables that are most relevant to their business, users can create a more accurate and comprehensive picture of their data.
Continuous data replication is another key capability of Synapse Link for Dataverse. This feature ensures that data is continuously replicated, supporting Create, Update, and Delete (CUD) transactions to maintain data accuracy and timeliness.
Streamlined environment linkage is also a key benefit of Synapse Link for Dataverse. Users can link or unlink their environment to Azure Synapse Analytics and/or Data Lake Storage Gen2 directly within their Azure subscription, making it easy to integrate with other Azure services.
Integrated data exploration is another powerful feature of Synapse Link for Dataverse. Users can explore and analyze selected data using Azure Synapse without the need for external configuration tools.
These capabilities empower organizations to derive valuable insights and make informed decisions, driving enhanced operational efficiency and business outcomes.
Property Changes
Property Changes are a significant aspect of Azure Synapse Link for Dataverse.
Developers can enable change tracking on base tables through an extension, while custom tables can have this change enabled via customization.
Custom tables may require manual adjustment in older application versions.
Synapse Link for Dataverse uses the SysRowVersion field together with the RecId to track changes once a table is activated in the maker portal.
This streamlines change tracking and synchronization.
The SysRowVersion field is introduced alongside the RecId in the table.
Custom Tables Requirement
Custom tables in Azure Synapse Link for Dataverse require some special attention. You'll need to set the table's row version change tracking property (Allow Row Version Change Tracking) to "Yes" for custom tables, ISV tables, data entities, and any base tables where the property is not enabled. This is a crucial step to ensure these tables appear in Synapse Link.
This small property change is required to integrate your custom and specific data tables into the Synapse Link environment. It's a simple adjustment, but one that's essential for a seamless data integration experience.
As of finance and operations application version 10.0.38, Microsoft has expanded the availability of change tracking to include 2750 base tables by default. However, custom tables are still subject to this requirement.
To summarize, the property must be set for the following:
- Custom tables
- ISV tables
- Data entities
- Base tables where the property is not already enabled
By following these guidelines, you'll be able to successfully integrate your custom data tables into the Synapse Link environment.
What Is Microsoft Dataverse?
Microsoft Dataverse is a cloud-based storage service that enables customers to securely store and manage business application data.
Dataverse is fully managed and maintained by Microsoft, so you don't have to worry about upkeep or maintenance.
Dataverse comes with a standard set of tables that cover typical business scenarios, but you can also create custom tables specific to your organization.
Dynamics 365 applications, such as Dynamics 365 Sales, Customer Service, or Talent, use Dataverse to store and secure the data they use.
Azure Data Lake Storage Gen2
Azure Synapse Link for Dataverse allows you to connect Dataverse to Azure Data Lake Storage Gen2, which provides a secure and scalable storage solution for your data.
To connect Dataverse to Azure Data Lake Storage Gen2, you need to sign in to Power Apps, select your preferred environment, and then select Azure Synapse Link. This will allow you to link your environment to a data lake and grant the Azure Synapse Link service access to your storage account.
Data exported by the Azure Synapse Link service is encrypted in transit using Transport Layer Security (TLS) 1.2 or higher, and it is encrypted at rest in Azure Data Lake Storage Gen2.
Together, these protections help keep exported data secure and support compliance with regulatory requirements.
When you create the link, you select the following for your Azure Data Lake Storage Gen2 account:
- Subscription
- Resource group
- Storage account
Ensure that your storage account meets the requirements specified in the Prerequisites section, and grant yourself an owner role on the storage account.
View Your Data in Azure Data Lake Storage Gen2
Viewing your data in Azure Data Lake Storage Gen2 is a straightforward process. You can access your data by selecting the desired Azure Synapse Link and then clicking "Go to Azure data lake" from the top panel.
To view the data, you'll need to expand the File Systems section and select the dataverse-environmentName-organizationUniqueName folder. This folder contains a model.json file that provides a list of tables that have been exported to the data lake, along with their names and versions. The model.json file also includes the initial sync status and sync completion time.
A folder that includes snapshot comma-delimited (CSV format) files is displayed for each table exported to the data lake. You can view these files by selecting the desired table and accessing the corresponding folder.
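Because model.json is plain JSON, a short script can list the exported tables and their metadata. The sketch below is illustrative only: the sample document is hypothetical and heavily trimmed, and the annotation name `InitialSyncState` is an assumption, so inspect a real model.json from your own data lake for the exact keys.

```python
import json

# Hypothetical, heavily trimmed model.json content for illustration only.
SAMPLE_MODEL_JSON = """
{
  "name": "cdm",
  "version": "1.0",
  "entities": [
    {
      "name": "account",
      "annotations": [{"name": "InitialSyncState", "value": "Completed"}],
      "attributes": [{"name": "Id", "dataType": "guid"},
                     {"name": "name", "dataType": "string"}]
    },
    {
      "name": "contact",
      "annotations": [{"name": "InitialSyncState", "value": "InProgress"}],
      "attributes": [{"name": "Id", "dataType": "guid"}]
    }
  ]
}
"""


def summarize_model(model_text: str) -> dict:
    """Map each exported table name to its column names and annotations."""
    model = json.loads(model_text)
    summary = {}
    for entity in model.get("entities", []):
        summary[entity["name"]] = {
            "columns": [a["name"] for a in entity.get("attributes", [])],
            "annotations": {a["name"]: a.get("value")
                            for a in entity.get("annotations", [])},
        }
    return summary


if __name__ == "__main__":
    for table, info in summarize_model(SAMPLE_MODEL_JSON).items():
        print(table, info["columns"], info["annotations"])
```

In practice you would read the file from the storage account instead of a string constant. The parsing logic stays the same.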
To link an existing Azure Synapse Link profile that uses only a data lake to an Azure Synapse Analytics workspace, append ?athena.updateLake=true to the web address that ends with exporttodatalake in your browser's address bar. This makes the workspace-linking option available for existing profiles.
Then complete the following steps:
- Select an existing profile from the Azure Synapse Link area.
- Select the extended option.
- Select Link to Azure Synapse Analytics Workspace and allow a few minutes for everything to be linked.
Incremental Update Folder Structure
The Incremental Update Folder Structure is a feature that creates a series of timestamped folders containing only the changes to the Dataverse data that occurred during a user-specified time interval.
This feature is currently available in preview and can be enabled to create incremental update folders every 15 minutes, capturing changes that occurred within that time interval.
Each folder is written in the Common Data Model format and contains yearly partitioned CSV files that are set to append-only mode.
A separate timestamped folder is created for the updates that occur within a user-specified time interval, and each folder is only created when there is a data update within that interval.
The top-level folder structure changes to timestamp folders that are aligned to the user-specified time interval, with a new timestamp folder created at the start of each interval (every 15 minutes in this example).
Here's an overview of the process:
- Timestamp folder created (e.g. 2022-07-28T10.00.00Z).
- Change occurs at source (e.g. update a row).
- The first change that is detected causes the entity sub-folder to be created (e.g. account) and the CSV file to be initialised (e.g. 2022.csv).
- The data is appended to the end of the CSV file.
- The change-at-source and append steps continue to repeat until the time interval closes (e.g. 15 minutes).
- Once the next time interval arrives, the next timestamp folder is created (e.g. 2022-07-28T10.15.00Z).
Note that even though the next timestamp folder has been created, it will remain empty of entity folders and CSV files until the next change is detected.
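The naming scheme in the process above can be sketched in a few lines of Python. This is an illustration of the layout described here, not code used by the service: the function names are hypothetical, and the choice to take the yearly partition from the row's creation date is an assumption.

```python
from datetime import datetime, timezone

FOLDER_FORMAT = "%Y-%m-%dT%H.%M.%SZ"  # e.g. 2022-07-28T10.15.00Z


def interval_folder(change_time: datetime, interval_minutes: int = 15) -> str:
    """Return the timestamp folder for the interval that contains change_time.

    change_time should be timezone-aware; it is converted to UTC first.
    """
    utc = change_time.astimezone(timezone.utc)
    minutes = utc.hour * 60 + utc.minute
    floored = minutes - minutes % interval_minutes  # start of the interval
    start = utc.replace(hour=floored // 60, minute=floored % 60,
                        second=0, microsecond=0)
    return start.strftime(FOLDER_FORMAT)


def target_path(entity: str, change_time: datetime, created_on: datetime,
                interval_minutes: int = 15) -> str:
    """Path of the yearly CSV file a change would be appended to (assumed layout)."""
    folder = interval_folder(change_time, interval_minutes)
    return f"{folder}/{entity}/{created_on.year}.csv"


if __name__ == "__main__":
    change = datetime(2022, 7, 28, 10, 7, 30, tzinfo=timezone.utc)
    created = datetime(2022, 1, 5, tzinfo=timezone.utc)
    print(target_path("account", change, created))
```

A change detected at 10:07:30 UTC falls into the 10.00.00Z folder, and one detected at exactly 10:15:00 UTC opens the next folder.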
Continuous Updates
Continuous updates are a key feature of Azure Synapse Link for Dataverse, allowing data to be continuously updated in the data lake.
Snapshots are updated every hour, ensuring that data is always up-to-date and reliable.
New snapshot files are created only if there is a data update, which prevents unnecessary file duplication.
Only the latest five snapshot files are retained, and stagnant data is automatically removed from the Azure Data Lake Storage Gen2 account.
This approach allows for efficient data management and minimizes storage costs.
The trickle feed engine continuously pushes changes in Dataverse to the corresponding CSV files, enabling near real-time updates.
The model.json file always points to the latest time-stamped snapshot file, ensuring that users have access to the most up-to-date data.
Readers that are already working with an older snapshot file can continue to do so while new readers pick up the latest one, which makes snapshots well suited to scenarios with longer-running downstream processes.
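The retention rule is easy to reason about with a small model. The sketch below only imitates the "keep the latest five" behavior described above. The file names are hypothetical and are chosen so that they sort chronologically, and the real service manages these files for you.

```python
def prune_snapshots(snapshot_names: list, keep: int = 5) -> tuple:
    """Split snapshot names into (kept, removed), keeping the newest `keep` files.

    Assumes the names sort chronologically, as the hypothetical names below do.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    ordered = sorted(snapshot_names)
    return ordered[-keep:], ordered[:-keep]


if __name__ == "__main__":
    # Hypothetical hourly snapshot files for one table.
    names = [f"account_2022-07-28T{hour:02d}-00-00Z.csv" for hour in range(8, 15)]
    kept, removed = prune_snapshots(names)
    latest = kept[-1]  # the file that model.json would point to
    print("latest:", latest)
    print("removed:", removed)
```

A downstream job that opened one of the older retained files keeps a stable input until that file ages out of the five-file window.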
Configuring Environment
Configuring your environment for Azure Synapse Link for Dataverse is a crucial step in getting started. You'll need to perform additional configuration steps if you're using cloud-hosted environments.
First, you'll need to complete a full database synchronization (DBSync) and put your environment in maintenance mode using Visual Studio. This ensures a smooth transition to Azure Synapse Link.
To enable data synchronization, you'll need to enable two flights: DMFEnableSqlRowVersionChangeTrackingIndexing and DMFEnableCreateRecIdIndexForDataSynchronization. These flights create the indexes required for data synchronization on the RecId and SysRowVersion fields.
You enable these flights by running SQL statements in your Tier 1 environment. In higher-tier environments, these indexes are created automatically when you enable change tracking on a table or entity.
Here are the flights you'll need to enable:
- DMFEnableSqlRowVersionChangeTrackingIndexing
- DMFEnableCreateRecIdIndexForDataSynchronization
Additionally, you'll need to run a script to perform initial indexing operations in your environment. If you don't run this script in your cloud-hosted environment (CHE), you may see the error "FnO-812" when you try to add tables to Azure Synapse Link.
Enable Finance and Operations
To enable finance and operations in Azure Synapse Link, you must first sign in to Power Apps and select the environment you want.
You can enable both finance and operations tables and finance and operations entities in Azure Synapse Link for Dataverse.
Finance and operations apps tables are available only in Azure Synapse Link; makers can't see them in the Tables area in Power Apps. To include finance and operations tables in Synapse Link, you must enable the Delta lake feature in your Synapse Link profile.
To enable finance and operations entities, you must enable finance and operations virtual entities in the Power Apps maker portal, and enable row version change tracking for the entities. Finance and operations entities start with the prefix mserp_.
To enable change tracking for finance and operations entities, select the Track changes option, and then select Save. If the option is unavailable, see the known limitations with finance and operations entities in the Microsoft documentation.
Here are the steps to enable change tracking for finance and operations entities:
- In Power Apps, select Tables on the left navigation pane, and then select the table you want.
- Select Properties > Advanced options.
- Select the Track changes option, and then select Save.
By following these steps, you can enable change tracking for finance and operations entities and access incremental data changes from finance and operations.
Working with Data
Azure Synapse Link for Dataverse enables you to easily integrate your Dynamics 365 data with Azure Synapse Analytics, allowing for seamless data analysis and reporting.
You can now bring your Dynamics 365 data into Azure Synapse Analytics, creating a unified view of your business data.
With Azure Synapse Link, you can use the data in Azure Synapse Analytics to create reports, dashboards, and data visualizations that provide valuable insights into your business operations.
Folders and Files
The folders and files used to store data in the data lake are organized in a way that makes it easy to navigate and understand. Each table exported to the data lake has a corresponding snapshot folder.
The model.json file is stored in a Common Data Model folder and contains metadata about the exported data. It lists the exported entities, or tables, and describes each entity with a name, description, annotations, and attributes.
A snapshot folder is included with each table exported to the data lake, providing a read-only copy of the data updated at regular intervals, typically every hour. This ensures data consumers can reliably consume data in the lake.
Data files are partitioned by month based on the createdOn value in each row, but the partition strategy can be changed to year within the advanced configuration settings. By default, the service performs an in-place update (UPSERT) of the incremental changes, but this can be changed to append only via the advanced configuration settings.
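The two settings in the paragraph above, partition strategy and write mode, can be illustrated with a small in-memory model. This is a conceptual sketch, not the service's implementation: the function names and the `Id` key are assumptions, and deletes are left out for brevity.

```python
from datetime import datetime


def partition_name(created_on: datetime, strategy: str = "month") -> str:
    """Name of the partition a row belongs to, based on its createdOn value."""
    if strategy == "month":
        return created_on.strftime("%Y-%m")
    if strategy == "year":
        return created_on.strftime("%Y")
    raise ValueError("strategy must be 'month' or 'year'")


def apply_change(rows: list, change: dict, append_only: bool = False) -> list:
    """Apply one incremental change to a partition's rows (dicts keyed by 'Id')."""
    if append_only:
        # Append only: earlier versions of the row stay in the file.
        return rows + [change]
    # In-place update (UPSERT): replace the existing row, or add it if it is new.
    for index, row in enumerate(rows):
        if row["Id"] == change["Id"]:
            return rows[:index] + [change] + rows[index + 1:]
    return rows + [change]


if __name__ == "__main__":
    rows = [{"Id": "a", "name": "Old name"}]
    change = {"Id": "a", "name": "New name"}
    print(partition_name(datetime(2022, 7, 28)))         # month partition
    print(apply_change(rows, change))                    # one row, updated
    print(apply_change(rows, change, append_only=True))  # two rows, history kept
```

With in-place updates a reader always sees one current row per record, while append only leaves it to the reader to pick the latest version of each record.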
The model.json file is updated on a regular interval to point to the relevant snapshot files, making it easy to keep track of the latest data.
Working with Metadata
Working with metadata is a crucial part of data management, and it's essential to understand its importance.
Metadata is like a map that helps you navigate and understand your data. It provides information about the data, such as who created it, when it was created, and how it's formatted.
A good example of metadata is a spreadsheet with a description of each column, making it easier to understand the data.
You can use metadata to track changes to your data, such as who made the changes and when. This is especially useful for collaborative projects where multiple people are working with the same data.
Metadata can also be used to categorize and filter data, making it easier to find specific information. For instance, you can use metadata to filter a list of customers by location.
In some cases, metadata can even be used to automatically populate fields in a database, saving you time and effort. For example, if you have a database of products, metadata can be used to automatically populate the product description field.
Metadata is not just limited to digital data; it can also be used for physical data, such as inventory management.
Access Incremental Changes
You can load incremental data changes from finance and operations into your own downstream data warehouse by creating an Azure Synapse Link profile that provides only incremental data. This feature efficiently syncs data from Dataverse via Azure Synapse Link by detecting changes since initial extraction or the last synchronization.
To create an Azure Synapse Link profile with incremental data, sign in to Power Apps and select the environment you want. Then, on the left navigation pane, select Azure Synapse Link and follow the prompts to create a new link.
Incremental data changes are stored in CSV files in timestamped folders, with each folder containing the changes that occurred during the specified time interval. For example, if the time interval is set to 15 minutes, the service will create incremental update folders every 15 minutes that will capture changes that occurred within the time interval.
Here are the steps to create an Azure Synapse Link profile with incremental data:
- Sign in to Power Apps and select the environment you want.
- On the left navigation pane, select Azure Synapse Link.
- On the Azure Synapse Link for Dataverse page, select + New link on the command bar.
- Select Subscription, Resource group and a Storage account.
- Select Next and choose the tables you want to include in the link.
- Enable the option Enable incremental update folder structure and set the time interval.
- Select Save to create the link.
Note that the incremental update folder structure feature is currently available in preview, and it creates a series of timestamped folders containing only the changes to the Dataverse data that occurred during the user-specified time interval.
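A downstream loader typically remembers the last timestamp folder it processed and picks up only newer ones. The sketch below shows one possible way to do that selection. The function name and the watermark approach are assumptions for illustration, and listing the folder names from the storage account is left out.

```python
from datetime import datetime, timezone

FOLDER_FORMAT = "%Y-%m-%dT%H.%M.%SZ"  # e.g. 2022-07-28T10.15.00Z


def parse_folder(name: str) -> datetime:
    """Parse a timestamp folder name into a UTC datetime."""
    return datetime.strptime(name, FOLDER_FORMAT).replace(tzinfo=timezone.utc)


def folders_to_process(folder_names, last_processed=None) -> list:
    """Return timestamp folders newer than last_processed, oldest first."""
    parsed = []
    for name in folder_names:
        try:
            parsed.append((parse_folder(name), name))
        except ValueError:
            continue  # skip anything that is not a timestamp folder, e.g. model.json
    cutoff = parse_folder(last_processed) if last_processed else None
    return [name for stamp, name in sorted(parsed)
            if cutoff is None or stamp > cutoff]


if __name__ == "__main__":
    listing = ["2022-07-28T10.15.00Z", "model.json",
               "2022-07-28T10.00.00Z", "2022-07-28T10.30.00Z"]
    print(folders_to_process(listing, last_processed="2022-07-28T10.00.00Z"))
```

Because the newest folder may still be receiving appended rows until its interval closes, a cautious loader might hold back the most recent folder and process it on the next run.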
Lake and Analytics
In Azure Synapse Analytics, you can analyze your Dataverse data in a limitless analytics service that brings together data integration, enterprise data warehousing, and big data analytics.
The service gives you the freedom to query data on your terms, using either serverless or dedicated resources—at scale. It also allows you to build analytics solutions on top of the Apache Spark engine.
You can ingest, explore, prepare, manage, and serve data for immediate business intelligence and machine learning needs all from a single service.
To understand what has materialized within Synapse, navigate to your Synapse workspace, open the Data hub, and expand Lake database. You will find that the following has been created:
- 1 x Lake Database.
- 2 x Tables per Entity (Near Real-Time and Snapshot).
- A standard set of 5 x tables for Dataverse choice labels.
The near real-time table is backed by CSV files directly beneath the entity folder, whereas the snapshot table is backed by CSV files beneath specific folder paths.
Once Azure Synapse Link has been switched on to continuously export Microsoft Dataverse data and the corresponding metadata, users have several options in how to query and analyze the data downstream. Serverless SQL is a query service that enables users to run SQL on files placed in Azure Storage.
You can query the near real-time data by right-clicking on the table, and selecting New SQL script > Select TOP 100 rows. Alternatively, you can navigate directly to the files on the data lake and use the OPENROWSET function to query the CSV file.
The auto-generated SQL will use the OPENROWSET function to query the CSV file, but it will be missing schema information. This is one of the benefits of using the lake database tables where the metadata is maintained by the Synapse Link service.
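For reference, a query of this kind can be assembled as follows. The helper is a hypothetical sketch that only builds the query text for a serverless SQL pool. The storage account, container, and folder layout are placeholders, and the inputs are assumed to be trusted because they are interpolated directly into the string.

```python
def openrowset_query(storage_account: str, container: str, entity: str,
                     top: int = 100) -> str:
    """Build a serverless SQL query that reads an entity's CSV files directly."""
    url = (f"https://{storage_account}.dfs.core.windows.net/"
           f"{container}/{entity}/*.csv")
    return (
        f"SELECT TOP {int(top)} *\n"
        "FROM OPENROWSET(\n"
        f"    BULK '{url}',\n"
        "    FORMAT = 'CSV',\n"
        "    PARSER_VERSION = '2.0'\n"
        ") AS [result]"
    )


if __name__ == "__main__":
    # Placeholder names; replace them with your own storage account and container.
    print(openrowset_query("mystorageaccount", "dataverse-env-org", "account"))
```

Since the CSV files carry no header row, a query like this returns columns without meaningful names, which is why the lake database tables are usually the more convenient entry point.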
Azure Synapse Analytics offers alternative analytics engines optimized for different workloads, including Apache Spark - a big data compute engine. You can right-click on a Synapse Link for Dataverse table and use the New notebook action to auto-generate code that will load Microsoft Dataverse data into an Apache Spark DataFrame.
For those looking for a code-free data transformation authoring experience, Data flows are visually designed and allow data engineers to develop data transformation logic without writing code. Using the Workspace DB connector, you can set the source of a data flow to a Dataverse database and table.
Frequently Asked Questions
How do I link Dataverse to Access?
To link Dataverse to Access, open your Access database and navigate to External Data > New Data Source > From Online Services > From Dataverse. This will guide you through the process of connecting your Access database to Dataverse data.
Sources
- https://learn.microsoft.com/en-us/power-apps/maker/data-platform/export-to-data-lake
- https://learn.microsoft.com/en-us/power-apps/maker/data-platform/azure-synapse-link-data-lake
- https://stoneridgesoftware.com/how-to-set-up-synapse-link-for-dataverse-to-enhance-system-integration/
- https://learn.microsoft.com/en-us/power-apps/maker/data-platform/azure-synapse-link-select-fno-data
- https://www.taygan.co/blog/2022/06/28/azure-synapse-link-for-dataverse