Performance Tuning Guide


Optimize Transformations for Partitioning


When the Data Integration Service uses multiple threads to run an Aggregator, Joiner, Rank, or Sorter transformation, the service uses cache partitioning to divide the cache size across the threads. To optimize performance for cache partitioning, configure multiple cache directories.
Note that a Lookup transformation can use only a single cache directory.
Consider the following solution to reduce bottlenecks for partitioned Aggregator, Joiner, Rank, and Sorter transformations:
Configure multiple cache directories.
Cache partitioning creates a separate cache for each partition that processes an Aggregator, Joiner, Rank, or Sorter transformation. During cache partitioning, each partition stores different data in a separate cache. Each cache contains the rows needed by that partition. Cache partitioning optimizes mapping performance because each thread queries a separate cache in parallel.
If the cache size is smaller than the amount of memory required to run the transformation, transformation threads write to the cache directory to store overflow values in cache files. When multiple threads write to a single directory, the mapping might encounter a bottleneck due to I/O contention, which can occur when threads write data to the same file system at the same time. When you configure multiple cache directories, the Data Integration Service assigns a cache directory to each transformation thread in a round-robin fashion, so the writes are spread across the directories.
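The round-robin assignment described above can be illustrated with a minimal sketch. It is not the Data Integration Service implementation; the function name `assign_cache_dirs` and the directory paths are hypothetical and chosen only for illustration.

```python
def assign_cache_dirs(cache_dirs, thread_count):
    """Map each thread index to a cache directory in round-robin order.

    Hypothetical helper: it only illustrates the idea of cycling through
    the configured directories, not how the service does it internally.
    """
    if not cache_dirs:
        raise ValueError("at least one cache directory is required")
    return {
        thread: cache_dirs[thread % len(cache_dirs)]
        for thread in range(thread_count)
    }


# Hypothetical directories, ideally located on separate physical disks.
dirs = ["/cache/disk1", "/cache/disk2", "/cache/disk3"]

for thread, directory in assign_cache_dirs(dirs, 4).items():
    print(f"thread {thread} -> {directory}")
```

With three directories and four threads, the fourth thread wraps around to the first directory. Under this assumption, contention is reduced most when the number of directories is at least the number of threads and the directories sit on different devices.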
In an Aggregator, Joiner, or Rank transformation, configure the cache directories in the Cache Directory advanced property. Use the default CacheDir system parameter value if an administrator entered multiple directories separated by semicolons for the Cache Directory property for the Data Integration Service in the Administrator tool. Or, you can enter a different value to configure multiple cache directories specific to the transformation.
In a Sorter transformation, configure the cache directories in the Work Directory advanced property. Use the default TempDir system parameter value if an administrator entered multiple directories separated by semicolons for the Temporary Directories property for the Data Integration Service in the Administrator tool. Or, you can enter a different value to configure multiple cache directories specific to the transformation.
