Data Integration Performance Tuning

Mapping design and environment

When you design a mapping, follow best practices to optimize mapping performance.
Consider the following best practices:
Reduce data volume.
  • Reduce the field precision of each column to the length that you need.
  • Reduce the number of columns. Remove unconnected fields.
  • Reduce the number of rows. Apply a source filter in the Source transformation, or use a Filter transformation.
  • Use a Filter transformation early in the mapping to remove unnecessary data.
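The effect of filtering rows and pruning fields early can be illustrated outside of Data Integration. The following is a minimal Python sketch, not product code: the function `filter_early`, the field names, and the threshold are hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterable, Iterator, List

Row = Dict[str, object]


def filter_early(rows: Iterable[Row], keep_fields: List[str], min_amount: float) -> Iterator[Row]:
    """Drop unneeded rows first, then keep only the fields that later steps use."""
    for row in rows:
        # Row filter applied as early as possible (hypothetical condition).
        if row["amount"] >= min_amount:
            # Column pruning: unconnected fields never reach later steps.
            yield {name: row[name] for name in keep_fields}


source = [
    {"id": 1, "amount": 5.0, "note": "unused"},
    {"id": 2, "amount": 50.0, "note": "unused"},
    {"id": 3, "amount": 500.0, "note": "unused"},
]
reduced = list(filter_early(source, ["id", "amount"], 10.0))
```

Every step that consumes `reduced` handles two narrow rows instead of three wide ones, which is the same saving a source filter and the removal of unconnected fields aim for in a mapping.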
Enable source partitioning.
Enable source partitioning whenever possible. The mapping task divides the source data into partitions and processes the partitions concurrently.
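As a rough analogy for what partitioning does, the sketch below splits a list of rows into partitions and processes them concurrently with Python's standard `concurrent.futures` module. It is an illustration of the general pattern only and makes no claim about how the mapping task implements partitioning; the function names and the doubling "transformation" are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence


def split_into_partitions(rows: Sequence[int], partition_count: int) -> List[Sequence[int]]:
    """Split rows into contiguous partitions of near-equal size."""
    size, remainder = divmod(len(rows), partition_count)
    partitions = []
    start = 0
    for index in range(partition_count):
        # The first `remainder` partitions take one extra row each.
        end = start + size + (1 if index < remainder else 0)
        partitions.append(rows[start:end])
        start = end
    return partitions


def process_partition(partition: Sequence[int]) -> int:
    """Hypothetical per-partition work: double each value and sum the results."""
    return sum(value * 2 for value in partition)


def run_partitioned(rows: Sequence[int], partition_count: int) -> int:
    """Process all partitions concurrently and combine the partial results."""
    partitions = split_into_partitions(rows, partition_count)
    with ThreadPoolExecutor(max_workers=partition_count) as pool:
        return sum(pool.map(process_partition, partitions))
```

For example, `run_partitioned(list(range(10)), 3)` splits ten rows into partitions of four, three, and three rows. Threads are used here for simplicity; in Python they mainly help with I/O-bound work such as reading from a source.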
Optimize data conversion.
  • Maintain consistency from source to target. Keep the same data type and precision across the pipeline whenever possible.
  • For variable length data types, use precision as small as possible.
  • Use the String data type instead of a date data type if the mapping performs no date operations on the field.
  • Eliminate unnecessary data type conversions. For example, if a mapping moves data from an Integer column to a Decimal column, then back to an Integer column, the unnecessary data type conversion slows performance.
  • Wherever possible, use integer values instead of other data types for comparisons in Lookup and Filter transformations.
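The cost of an unnecessary round trip can be sketched in plain Python. This is an analogy under stated assumptions, not a model of the Data Integration type system: `round_trip` and `pass_through` are hypothetical names, and Python's `Decimal` stands in for a Decimal column.

```python
from decimal import Decimal
from typing import List


def round_trip(values: List[int]) -> List[int]:
    """Integer -> Decimal -> Integer: two conversions per value, same result."""
    return [int(Decimal(value)) for value in values]


def pass_through(values: List[int]) -> List[int]:
    """Keep the Integer type from source to target: no conversions."""
    return list(values)


sample = [1, 20, 300]
same_result = round_trip(sample) == pass_through(sample)
```

Both functions return identical data, so the conversions in `round_trip` are pure overhead that grows with the number of rows.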
Use local flat file staging.
When a mapping writes to or reads from a cloud data warehouse, you can optimize mapping performance by configuring the Secure Agent to stage data locally in a temporary folder before writing to or reading from the cloud data warehouse endpoint.
In the Secure Agent properties, set the staging property INFA_DTM_STAGING_ENABLED_CONNECTORS for Tomcat to the plugin ID of the cloud data warehouse connector.
Data Integration creates a flat file locally to stage the data and then loads the data from the staging file to the data warehouse target or unloads data from the data warehouse source and stages it locally. For more information, see the individual cloud connector performance tuning guides.
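The stage-then-load pattern described above can be sketched as follows. This is a simplified Python illustration, not the Secure Agent's implementation: `bulk_load` is a hypothetical stand-in for a warehouse bulk-load call, and the target is an in-memory list rather than a real endpoint.

```python
import csv
import os
import tempfile
from typing import Dict, List, Optional


def stage_rows_locally(rows: List[Dict[str, object]], fieldnames: List[str],
                       staging_dir: Optional[str] = None) -> str:
    """Write rows to a temporary CSV staging file and return its path."""
    handle, path = tempfile.mkstemp(suffix=".csv", dir=staging_dir)
    with os.fdopen(handle, "w", newline="", encoding="utf-8") as staging_file:
        writer = csv.DictWriter(staging_file, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return path


def bulk_load(path: str, target: List[Dict[str, str]]) -> int:
    """Hypothetical bulk load: read the whole staging file in a single pass."""
    with open(path, newline="", encoding="utf-8") as staging_file:
        target.extend(csv.DictReader(staging_file))
    return len(target)


def stage_and_load(rows: List[Dict[str, object]], fieldnames: List[str],
                   target: List[Dict[str, str]]) -> int:
    """Stage rows in a local flat file, load them in bulk, then clean up."""
    path = stage_rows_locally(rows, fieldnames)
    try:
        return bulk_load(path, target)
    finally:
        os.remove(path)  # Remove the staging file even if the load fails.
```

The point of the pattern is that the target receives one bulk operation per staging file instead of one request per row. Note that values read back from a CSV file are strings.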
Tune hardware.
For example, improve network speed and use multiple CPUs.
Consider the Secure Agent virtual machine instance type.
Choose performant cloud instances such as Amazon Elastic Compute Cloud (EC2), Azure Virtual Machine (Azure VM), or Google Cloud Platform (GCP) based on the resource requirements.
