The Data Integration Service allocates cache memory for Aggregator, Joiner, Lookup, Rank, and Sorter transformations in a mapping. It creates index and data caches for the Aggregator, Joiner, Lookup, and Rank transformations, and one cache for the Sorter transformation.
You can configure the cache sizes for these transformations. The cache size determines how much memory the Data Integration Service allocates for each transformation cache at the start of a mapping run.
If the cache size is larger than the available memory on the machine, the Data Integration Service cannot allocate enough memory and the mapping run fails.
If the cache size is smaller than the amount of memory required to run the transformation, the Data Integration Service processes part of the transformation in memory and stores overflow data in cache files on disk. Paging to cache files increases processing time. For optimal performance, configure the cache size so that the Data Integration Service can process the complete transformation in memory.
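The three outcomes described above can be summarized in a short sketch. This is an illustrative model only, not Data Integration Service code: the function name, the enum, and the idea of passing the three quantities as plain byte counts are assumptions made for the example.

```python
from enum import Enum


class CacheOutcome(Enum):
    """Possible results of a configured cache size (hypothetical names)."""
    ALLOCATION_FAILS = "mapping run fails: cache memory cannot be allocated"
    SPILLS_TO_DISK = "overflow data is stored in cache files on disk"
    FITS_IN_MEMORY = "complete transformation is processed in memory"


def classify_cache_size(cache_size: int,
                        available_memory: int,
                        required_memory: int) -> CacheOutcome:
    """Model how a configured cache size behaves, per the rules above.

    All arguments are byte counts. `required_memory` is the amount the
    transformation needs to run entirely in memory.
    """
    # A cache larger than the machine's available memory cannot be allocated.
    if cache_size > available_memory:
        return CacheOutcome.ALLOCATION_FAILS
    # A cache smaller than the requirement forces overflow to cache files.
    if cache_size < required_memory:
        return CacheOutcome.SPILLS_TO_DISK
    # Otherwise the whole transformation fits in the cache.
    return CacheOutcome.FITS_IN_MEMORY
```

The target for tuning is the last case: a cache size at least as large as the transformation requires, but no larger than the memory available on the machine.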
By default, the Data Integration Service automatically calculates the memory requirements at run time, based on the maximum amount of memory that the service can allocate. After you run a mapping in auto cache mode, you can tune the cache sizes for the transformations: analyze the transformation statistics in the mapping log to determine the cache sizes required for optimal performance, and then configure specific cache sizes for the transformations.
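As a rough aid for the tuning step, the following sketch checks a set of candidate cache sizes before you configure them. It assumes that you have already read the required cache sizes from the mapping log by hand. The function name, the transformation names, and the rule that the combined caches must fit in available memory are assumptions for illustration, not documented product behavior.

```python
def plan_cache_sizes(required_bytes: dict[str, int],
                     available_bytes: int) -> dict[str, int]:
    """Return one cache size per transformation, or raise if they cannot fit.

    `required_bytes` maps a transformation name to the cache size that the
    mapping log statistics indicate it needs (values entered manually).
    Assumption: all caches are allocated at the start of the mapping run,
    so their combined size must not exceed the available memory.
    """
    total = sum(required_bytes.values())
    if total > available_bytes:
        raise ValueError(
            f"caches need {total} bytes but only {available_bytes} are available"
        )
    # Configure each cache at exactly the size the statistics call for.
    return dict(required_bytes)


if __name__ == "__main__":
    MB = 1024 ** 2
    # Hypothetical transformation names and sizes, for illustration only.
    observed = {"Sorter_Orders": 64 * MB, "Joiner_Customers": 128 * MB}
    print(plan_cache_sizes(observed, available_bytes=512 * MB))
```

If the check raises an error, the options are to accept paging to disk for some transformations or to make more memory available to the service.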