Teradata on Azure Data Integration and Analysis


Teradata on Azure combines Teradata's analytics platform with Microsoft's cloud to enable seamless data integration and analysis. With Teradata on Azure, you can easily integrate your on-premises data with cloud-based data sources.

This integration is made possible through Azure Data Factory, which lets you create data pipelines that move and transform data between different sources. By leveraging Azure Data Factory, you can simplify your integration process and reduce the complexity of working with disparate data sources.

Teradata on Azure also provides advanced analytics capabilities through the use of Azure Machine Learning and Azure Databricks. These tools enable you to build and deploy machine learning models and perform advanced data analytics on your integrated data.

Teradata on Azure Setup

To set up Teradata on Azure, you'll need to create a new Azure Virtual Network (VNet) and a subnet for your Teradata instance.

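As a rough illustration, here's a minimal sketch of creating that VNet and subnet with the azure-mgmt-network Python SDK; the resource group, names, and address ranges are placeholder assumptions, not values from Teradata's documentation.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient

# Authenticate with whatever credential is available (CLI, managed identity, etc.).
credential = DefaultAzureCredential()
network_client = NetworkManagementClient(credential, "<subscription-id>")

# Create a VNet with a dedicated subnet for the Teradata instance.
poller = network_client.virtual_networks.begin_create_or_update(
    "teradata-rg",    # resource group (placeholder)
    "teradata-vnet",  # VNet name (placeholder)
    {
        "location": "eastus",
        "address_space": {"address_prefixes": ["10.0.0.0/16"]},
        "subnets": [{"name": "teradata-subnet", "address_prefix": "10.0.0.0/24"}],
    },
)
vnet = poller.result()
print(f"Created VNet {vnet.name} with subnet {vnet.subnets[0].name}")
```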

Teradata recommends a minimum of 4 vCPUs and 16 GB of memory for optimal performance, which allows your Teradata instance to handle large amounts of data and user traffic.

To deploy Teradata on Azure, you can use the Azure Marketplace, which offers a pre-configured Teradata image that includes all the necessary software and configuration settings for a successful deployment. The Marketplace image also includes a script that automates the Teradata installation process, and the script can be customized to fit your specific needs and requirements.

You'll also need to create a storage account for your Teradata data, which can be done through the Azure Portal. Teradata supports both Azure Blob Storage and Azure Data Lake Storage Gen2, so make sure to select the storage option that matches your use case and requirements.
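For instance, here's a hedged sketch using the azure-mgmt-storage Python SDK to create a StorageV2 account; enabling the hierarchical namespace (is_hns_enabled) is what makes it Data Lake Storage Gen2 rather than plain Blob Storage. All names are placeholders.

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.storage import StorageManagementClient
from azure.mgmt.storage.models import Sku, StorageAccountCreateParameters

credential = DefaultAzureCredential()
storage_client = StorageManagementClient(credential, "<subscription-id>")

# StorageV2 with hierarchical namespace enabled = Data Lake Storage Gen2;
# omit is_hns_enabled (or set it False) for plain Blob Storage.
params = StorageAccountCreateParameters(
    sku=Sku(name="Standard_LRS"),
    kind="StorageV2",
    location="eastus",
    is_hns_enabled=True,
)
poller = storage_client.storage_accounts.begin_create(
    "teradata-rg", "teradatadata001", params  # placeholder names
)
account = poller.result()
print(f"Created storage account {account.name}")
```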

Data Transfer and Loading

Loading data from Teradata into Azure is a breeze, thanks to the CData JDBC Driver. You can load Teradata data as a dataframe using the connection information.

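A minimal sketch of what that looks like in PySpark, assuming the CData Teradata JDBC driver JAR is on the cluster's classpath; the driver class name, URL format, and table names below follow CData's usual conventions and are assumptions, not values from this article:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("teradata-load").getOrCreate()

# Load a Teradata table as a DataFrame through the CData JDBC driver.
df = (
    spark.read.format("jdbc")
    .option("driver", "cdata.jdbc.teradata.TeradataDriver")
    .option("url", "jdbc:teradata:Server=myserver;Database=mydb;User=myuser;Password=mypassword;")
    .option("dbtable", "mytable")
    .load()
)
df.show(5)
```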

To load large amounts of data, it's recommended to enable parallel copy with data partitioning. This allows the service to run parallel queries against your Teradata source to load data by partitions. The parallel degree is controlled by the parallelCopies setting on the copy activity.

The Teradata connector provides built-in data partitioning to copy data from Teradata in parallel. You can partition data using the Hash option, which automatically detects the primary index column and applies a hash against it. This is especially useful for full loads from large tables.

Alternatively, you can use dynamic range partitioning, which is ideal for loading large amounts of data using a custom query. This requires specifying the partition column, upper bound, and lower bound.
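To make this concrete, here's a hedged sketch of a copy activity source configured for dynamic range partitioning, shown as a Python dict mirroring the activity's JSON (the Hash option would simply set "partitionOption": "Hash" and drop the range settings). The table name, column, and bounds are placeholders, and the ?AdfRangePartition* tokens are the built-in placeholders the service substitutes at run time:

```python
# Copy activity type properties (Python dict mirroring the activity JSON)
# for a partitioned, parallel read from Teradata.
copy_type_properties = {
    "source": {
        "type": "TeradataSource",
        "query": (
            "SELECT * FROM <table> "
            "WHERE ?AdfRangePartitionColumnName >= ?AdfRangePartitionLowbound "
            "AND ?AdfRangePartitionColumnName <= ?AdfRangePartitionUpbound"
        ),
        "partitionOption": "DynamicRange",
        "partitionSettings": {
            "partitionColumnName": "<partition-column>",
            "partitionUpperBound": "<upper-bound>",
            "partitionLowerBound": "<lower-bound>",
        },
    },
    # parallelCopies on the copy activity controls the parallel degree.
    "parallelCopies": 4,
}
```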

Here's a summary of the suggested settings for different scenarios:

  • Full load from a large table: use the Hash partition option, which automatically detects the primary index column and hashes against it.
  • Loading large amounts of data with a custom query: use the dynamic range partition option and specify the partition column, upper bound, and lower bound.

By following these settings, you can optimize your data transfer and loading process from Teradata to Azure, even with large amounts of data.

Data Configuration and Mapping


Data Configuration and Mapping is a crucial step in setting up Teradata on Azure. You need to map Teradata's data types to the internal data types used by the service to ensure seamless data transfer.

To do this, refer to the data type mappings in the Data Type Mapping section below.

By mapping your Teradata data types correctly, you can ensure that your data is transferred accurately and efficiently to Azure.

Data Type Mapping

Data type mapping ensures that data is accurately transferred between systems, without loss or distortion.

In our experience, data type mapping can be a complex process, but it's essential to get it right. To simplify the process, let's take a look at the data type mappings for Teradata.

Here are some key mappings between Teradata data types and the service's interim data types to keep in mind:

  • Integer → Int32; BigInt → Int64; SmallInt → Int16
  • Decimal → Decimal; Double → Double
  • Char, VarChar, and Clob → String
  • Byte, VarByte, and Blob → Byte[]
  • Date and Timestamp → DateTime; Time → TimeSpan

Some data types, like Graphic, Interval Day, and Interval Day To Hour, are not supported by the interim service, so you'll need to apply an explicit cast in your source query.
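For example, a source query along these lines (the table and column names are purely illustrative) casts an unsupported interval column to a character type before the copy runs:

```python
# Source query for the copy activity; the CAST turns the unsupported
# INTERVAL column into a VARCHAR that the service can map.
source_query = (
    "SELECT order_id, "
    "CAST(processing_time AS VARCHAR(40)) AS processing_time "
    "FROM mydb.orders"
)
```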

Connector Configuration Details


To configure the Teradata connector, you'll need to provide some essential properties. Start by selecting the source as Teradata from the Connections tab.

The Teradata connector supports the following properties: type, connectionString, username, password, connectVia, TdmstPortNumber, UseDataEncryption, CharacterSet, MaxRespSize, and MechanismName.

You can configure the connection properties by clicking Add Connection, selecting Teradata as the source, and then setting the required properties. The type property must be set to Teradata, and the connectionString property specifies the information needed to connect to the Teradata instance.

The username and password properties are optional and only required when using Windows authentication. The connectVia property specifies the Integration Runtime to be used to connect to the data store.

The TdmstPortNumber property is set to 1025 by default and should not be changed unless instructed to do so by Technical Support. The UseDataEncryption property can be set to 0 (disabled) or 1 (enabled) to specify whether to encrypt all communication with the Teradata database.


The CharacterSet property specifies the character set to use for the session, and the MaxRespSize property sets the maximum size of the response buffer for SQL requests.

To connect to Teradata using the JDBC URL, you'll need to reference the class for the JDBC Driver and construct a connection string. You can also use the connection string designer built into the Teradata JDBC Driver to assist with constructing the JDBC URL.
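As an illustration, here's a hedged sketch of a direct JDBC connection from Python using the jaydebeapi package; the CData driver class name, JAR path, and connection string are assumptions based on CData's usual conventions:

```python
import jaydebeapi

# Connect via the CData Teradata JDBC driver; adjust the JAR path and
# connection properties to match your environment.
conn = jaydebeapi.connect(
    "cdata.jdbc.teradata.TeradataDriver",
    "jdbc:teradata:Server=myserver;Database=mydb;User=myuser;Password=mypassword;",
    jars="/path/to/cdata.jdbc.teradata.jar",
)
cursor = conn.cursor()
cursor.execute("SELECT TOP 5 * FROM mytable")
for row in cursor.fetchall():
    print(row)
conn.close()
```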

Here are the required properties for the Teradata linked service:

  • type: must be set to Teradata (required).
  • connectionString: the information needed to connect to the Teradata instance (required).
  • username and password: credentials, needed only when you're using Windows authentication (optional).
  • connectVia: the Integration Runtime used to connect to the data store (optional).
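Put together, a linked service definition looks roughly like the following, shown here as a Python dict mirroring the service's JSON; the server, database, and runtime names are placeholders:

```python
teradata_linked_service = {
    "name": "TeradataLinkedService",
    "properties": {
        "type": "Teradata",
        "typeProperties": {
            # Optional settings such as TdmstPortNumber, UseDataEncryption,
            # CharacterSet, and MaxRespSize can be appended to this string.
            "connectionString": "server=<server>;database=<database>;user id=<username>;password=<password>"
        },
        "connectVia": {
            "referenceName": "<integration-runtime-name>",
            "type": "IntegrationRuntimeReference",
        },
    },
}
```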

Dataset Properties

When working with datasets, it's essential to understand the properties that define them. To copy data from Teradata, the type property of the dataset must be set to TeradataTable.

The database property specifies the name of the Teradata instance and ensures you connect to the correct one. It's not required if a query is specified in the activity source.


The table property specifies the name of the table in the Teradata instance, helping you pinpoint the exact data you want to copy. Like the database property, it's not required if a query is specified in the activity source.

Here's a summary of the supported properties for copying data from Teradata:

  • type: must be set to TeradataTable (required).
  • database: the name of the Teradata instance (not required if the activity source specifies a query).
  • table: the name of the table in the Teradata instance (not required if the activity source specifies a query).
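A minimal dataset definition might look like this, again as a Python dict mirroring the JSON; the linked service reference and names are placeholders:

```python
teradata_dataset = {
    "name": "TeradataDataset",
    "properties": {
        "type": "TeradataTable",
        "typeProperties": {
            "database": "<database-name>",
            "table": "<table-name>",
        },
        "linkedServiceName": {
            "referenceName": "TeradataLinkedService",
            "type": "LinkedServiceReference",
        },
    },
}
```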

Supported Capabilities

The Teradata connector offers a range of supported capabilities that make data integration straightforward.

You can use the copy activity to transfer data from a Teradata source, and it's supported for both Azure integration runtime and self-hosted integration runtime.

The connector supports lookup activity as well, which is useful for retrieving data from a Teradata source.

Here are the specific capabilities that the Teradata connector supports:

  • Copy activity (supported as a source only)
  • Lookup activity

These capabilities are available for Teradata versions 14.10, 15.0, 15.10, 16.0, 16.10, and 16.20.

Frequently Asked Questions

What cloud does Teradata use?

Teradata VantageCloud is hosted on public cloud service providers such as Amazon Web Services, Microsoft Azure, and Google Cloud. This allows for scalable and secure data management.
