Table of Contents

Search

  1. Preface
  2. Understanding Pipeline Partitioning
  3. Partition Points
  4. Partition Types
  5. Pushdown Optimization
  6. Pushdown Optimization and Transformations
  7. Real-time Processing
  8. Commit Points
  9. Row Error Logging
  10. Workflow Recovery
  11. Stopping and Aborting
  12. Concurrent Workflows
  13. Grid Processing
  14. Load Balancer
  15. Workflow Variables
  16. Parameters and Variables in Sessions
  17. Parameter Files
  18. FastExport
  19. External Loading
  20. FTP
  21. Session Caches
  22. Incremental Aggregation
  23. Session Log Interface
  24. Understanding Buffer Memory
  25. High Precision Data

Advanced Workflow Guide

Advanced Workflow Guide

Integration Service Processing for Incremental Aggregation

Integration Service Processing for Incremental Aggregation

The first time you run an incremental aggregation session, the Integration Service processes the entire source. At the end of the session, the Integration Service stores aggregate data from that session run in two files, the index file and the data file. The Integration Service creates the files in the cache directory specified in the Aggregator transformation properties.
Each subsequent time you run the session with incremental aggregation, you use the incremental source changes in the session. For each input record, the Integration Service checks historical information in the index file for a corresponding group. If it finds a corresponding group, the Integration Service performs the aggregate operation incrementally, using the aggregate data for that group, and saves the incremental change. If it does not find a corresponding group, the Integration Service creates a new group and saves the record data.
When writing to the target, the Integration Service applies the changes to the existing target. It saves modified aggregate data in the index and data files to be used as historical data the next time you run the session.
If the source changes significantly and you want the Integration Service to continue saving aggregate data for future incremental changes, configure the Integration Service to overwrite existing aggregate data with new aggregate data.
Each subsequent time you run a session with incremental aggregation, the Integration Service creates a backup of the incremental aggregation files. The cache directory for the Aggregator transformation must contain enough disk space for two sets of the files.
When you partition a session that uses incremental aggregation, the Integration Service creates one set of cache files for each partition.
The Integration Service creates new aggregate data, instead of using historical data, when you perform one of the following tasks:
  • Save a new version of the mapping.
  • Configure the session to reinitialize the aggregate cache.
  • Move the aggregate files without correcting the configured path or directory for the files in the session properties.
  • Change the configured path or directory for the aggregate files without moving the files to the new location.
  • Delete cache files.
  • Decrease the number of partitions.
When the Integration Service rebuilds incremental aggregation files, the data in the previous files is lost.
To protect the incremental aggregation files from file corruption or disk failure, periodically back up the files.

0 COMMENTS

We’d like to hear from you!