One of the most important ways to improve performance is to avoid unnecessary file sharing. When properly configured, shared file systems can provide good performance for the sequential access of source and target files. However, the random access required for persistent cache files, especially large persistent cache files, can be more problematic.
Use the following guidelines for configuring persistent cache files, such as persistent dynamic lookups, for a grid with a shared file system:
When possible, configure the session cache size to keep smaller persistent cache files in memory.
Add a Sorter transformation to the mapping to sort the input rows before the persistent lookup. Shifting the work from the persistent lookup to the Sorter transformation can improve performance because the Sorter transformation can use the local file system.
Group rows that require access to the same page of the lookup cache to minimize the number of times the Integration Service reads each page of the cache.
When the input data is large, use source-based commits to divide it into transactions that are small enough for the Sorter transformation to sort in memory.
For example, you have a 4 GB persistent dynamic lookup cache that cannot be reduced without changing the mapping logic, and you have 10 GB of source data. First, add a Sorter transformation to sort the input data and reduce random access to the lookup cache. Then complete the following tasks:
Configure the session to perform source-based commits with 1 GB commit intervals.
Set the Sorter transformation transaction scope to Transaction.
Configure the Sorter transformation for a 1 GB cache size, enough to hold the source input from one commit interval.
With this configuration, the Integration Service sorts 1 GB of input data at a time, so the rows that it passes to the persistent lookup in each transaction require access to similar data in the cache.
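The effect of sorting each commit interval can be illustrated with a small simulation. The following Python sketch is not PowerCenter code and does not model the actual lookup cache format. It assumes a lookup cache divided into fixed-size pages, a small least-recently-used buffer of pages held in memory, and input rows identified by integer keys. The function names and the scaled-down sizes (400 pages for the 4 GB cache, 100,000 rows for the 10 GB source, 10,000 rows per commit interval) are hypothetical and chosen only for illustration.

```python
import random
from collections import OrderedDict


def count_page_reads(keys, keys_per_page, pages_in_memory):
    """Count page reads from disk, assuming an LRU buffer of cache pages."""
    resident = OrderedDict()  # pages currently in memory, oldest first
    reads = 0
    for key in keys:
        page = key // keys_per_page
        if page in resident:
            resident.move_to_end(page)  # page hit: mark as most recently used
        else:
            reads += 1  # page miss: read the page from the shared file system
            resident[page] = True
            if len(resident) > pages_in_memory:
                resident.popitem(last=False)  # evict the least recently used page
    return reads


def sort_in_chunks(keys, chunk_size):
    """Sort each chunk independently, as a transaction-scoped sort would."""
    result = []
    for start in range(0, len(keys), chunk_size):
        result.extend(sorted(keys[start:start + chunk_size]))
    return result


def demo(seed=0):
    """Compare page reads for unsorted input and chunk-sorted input."""
    rng = random.Random(seed)
    # Hypothetical scale: 400 pages of 1,000 keys, 100,000 input rows.
    keys = [rng.randrange(400_000) for _ in range(100_000)]
    unsorted_reads = count_page_reads(keys, 1000, 40)
    chunked_reads = count_page_reads(sort_in_chunks(keys, 10_000), 1000, 40)
    return unsorted_reads, chunked_reads


if __name__ == "__main__":
    unsorted_reads, chunked_reads = demo()
    print("page reads, unsorted input:", unsorted_reads)
    print("page reads, chunk-sorted input:", chunked_reads)
```

In this model, rows within a sorted chunk reach the cache pages in ascending order, so each page is read at most once per chunk. Unsorted input can cause the same page to be read again whenever it has been evicted between accesses.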
If more than one file system is available, configure the cache files for each partition to use different file systems.
If more than one file system is available, configure the sessions to distribute their files across the different file systems.
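One simple way to plan the distribution of partition cache files is a round-robin assignment. The following Python sketch is only a planning aid under stated assumptions: the function name and the mount point paths are hypothetical, and the resulting directories would still have to be entered in the session configuration by hand.

```python
def assign_cache_dirs(partition_count, mount_points):
    """Map partition numbers (starting at 1) to cache directories, round-robin."""
    if not mount_points:
        raise ValueError("at least one file system is required")
    return {
        index + 1: mount_points[index % len(mount_points)]
        for index in range(partition_count)
    }


if __name__ == "__main__":
    # Hypothetical mount points; substitute the file systems available on the grid.
    plan = assign_cache_dirs(4, ["/mnt/fs1/cache", "/mnt/fs2/cache"])
    for partition, directory in plan.items():
        print(f"partition {partition}: {directory}")
```

With two file systems and four partitions, the odd-numbered partitions map to the first file system and the even-numbered partitions to the second, so neighboring partitions do not compete for the same file system.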