Aggregator transformations often slow mapping performance because they must group data before processing it. They also need additional memory to hold intermediate group results.
Consider the following solutions for Aggregator transformation bottlenecks:
Group by simple columns.
You can optimize Aggregator transformations when you group by simple columns. When possible, use numbers instead of strings and dates in the columns used for the GROUP BY. Avoid complex expressions in the Aggregator transformation.
Use sorted input.
To increase mapping performance, pass sorted data to the Aggregator transformation and enable the Sorted Input option.
The Sorted Input option decreases the use of aggregate caches. When you use the Sorted Input option, the Data Integration Service assumes all data is sorted by group. As the Data Integration Service reads rows for a group, it performs aggregate calculations. When necessary, it stores group information in memory.
The Sorted Input option reduces the amount of data cached during the mapping and improves performance. Use the Sorted Input option or a Sorter transformation to pass sorted data to the Aggregator transformation.
You can increase performance when you use the Sorted Input option in mappings with multiple partitions.
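The effect of sorted input on caching can be illustrated outside the product. The following Python sketch is only an analogy and does not show how the Data Integration Service is implemented. The function names and sample rows are hypothetical. With unsorted input, a running total for every group must stay in memory until the last row is read. With input sorted by group, each group can be finished and emitted as soon as its last row passes.

```python
from itertools import groupby
from operator import itemgetter


def aggregate_unsorted(rows):
    """Sum amounts per key when the input order is unknown.

    Every group's running total stays in the cache until all rows are read.
    """
    cache = {}
    for key, amount in rows:
        cache[key] = cache.get(key, 0) + amount
    return cache


def aggregate_sorted(rows):
    """Sum amounts per key, assuming rows arrive already sorted by key.

    Only the current group is held in memory; each result is yielded
    as soon as the group ends.
    """
    for key, group in groupby(rows, key=itemgetter(0)):
        yield key, sum(amount for _, amount in group)


# Hypothetical sample data: (group key, amount)
rows = [("east", 10), ("west", 5), ("east", 7), ("north", 3), ("west", 1)]

# Stand-in for an upstream sort, for example a Sorter transformation.
sorted_rows = sorted(rows, key=itemgetter(0))

unsorted_totals = aggregate_unsorted(rows)
sorted_totals = list(aggregate_sorted(sorted_rows))
```

Both functions produce the same totals. The difference is how much intermediate state each one has to hold while reading rows.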
Filter data before you aggregate it.
If you use a Filter transformation in the mapping, place the transformation before the Aggregator transformation to reduce unnecessary aggregation.
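As a rough illustration of why filter placement matters, the Python sketch below counts how many rows and groups an aggregation step has to handle with and without an upstream filter. The `aggregate` function, the `orders` data, and the "shipped" condition are hypothetical and are not part of the product.

```python
from collections import defaultdict


def aggregate(rows):
    """Sum amounts per region and report how many rows were processed."""
    totals = defaultdict(int)
    rows_seen = 0
    for region, _status, amount in rows:
        rows_seen += 1
        totals[region] += amount
    return dict(totals), rows_seen


# Hypothetical sample data: (region, status, amount)
orders = [
    ("east", "shipped", 10),
    ("east", "cancelled", 4),
    ("west", "shipped", 5),
    ("west", "cancelled", 9),
    ("north", "cancelled", 2),
]

# Without a filter, the aggregation handles every row and every group.
all_totals, all_rows_seen = aggregate(orders)

# With the filter placed first, only the needed rows reach the aggregation.
shipped = (row for row in orders if row[1] == "shipped")
shipped_totals, shipped_rows_seen = aggregate(shipped)
```

In this sample, filtering first reduces the work from five rows in three groups to two rows in two groups.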
Limit port connections.
Limit the number of connected input/output or output ports to reduce the amount of data the Aggregator transformation stores in the data cache.
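The relationship between connected ports and cache size can be sketched loosely in Python. This is an analogy under simplifying assumptions, not a description of the actual data cache layout. The `cache_groups` and `cached_values` helpers and the sample rows are hypothetical. The sketch keeps one cached record per group and stores a value only for each connected port.

```python
def cache_groups(rows, group_port, connected_ports):
    """Keep one cached record per group, holding only the connected ports."""
    cache = {}
    for row in rows:
        cache[row[group_port]] = {port: row[port] for port in connected_ports}
    return cache


def cached_values(cache):
    """Count the individual values held in the cache."""
    return sum(len(record) for record in cache.values())


# Hypothetical sample rows with two ports that nothing downstream needs.
rows = [
    {"region": "east", "amount": 10, "customer": "Acme", "notes": "rush"},
    {"region": "east", "amount": 7, "customer": "Bolt", "notes": ""},
    {"region": "west", "amount": 5, "customer": "Core", "notes": "gift"},
]

wide = cache_groups(rows, "region", ["region", "amount", "customer", "notes"])
narrow = cache_groups(rows, "region", ["region", "amount"])
```

With two groups, connecting all four ports caches eight values, while connecting only the two required ports caches four.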