Advanced Workflow Guide

Understanding Pipeline Partitioning Overview

You create a session for each mapping you want the Integration Service to run. Each mapping contains one or more pipelines. A pipeline consists of a source qualifier and all the transformations and targets that receive data from that source qualifier. When the Integration Service runs the session, it can achieve higher performance by partitioning the pipeline and performing the extraction, transformation, and load for each partition in parallel.
A partition is a pipeline stage that executes in a single reader, transformation, or writer thread. The number of partitions in any pipeline stage equals the number of threads in the stage. By default, the Integration Service creates one partition in every pipeline stage.
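The relationship between partitions and threads can be illustrated with a minimal Python sketch. It is not how the Integration Service is implemented. The function name `run_partitioned_pipeline`, the queue-based hand-off between stages, and the round-robin split of source rows are all assumptions made for illustration. The sketch only shows that each partition gets one reader, one transformation, and one writer thread, so the thread count in each stage equals the number of partitions.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of a partition's row stream


def run_partitioned_pipeline(rows, num_partitions, transform):
    """Run reader -> transformation -> writer with one thread per partition per stage.

    Returns (written_rows, thread_count). All names are illustrative only.
    """
    written = []
    lock = threading.Lock()
    threads = []

    def reader(partition_rows, out_q):
        for row in partition_rows:
            out_q.put(row)
        out_q.put(_DONE)

    def transformer(in_q, out_q):
        while (row := in_q.get()) is not _DONE:
            out_q.put(transform(row))
        out_q.put(_DONE)

    def writer(in_q):
        while (row := in_q.get()) is not _DONE:
            with lock:  # the shared list stands in for a target
                written.append(row)

    for p in range(num_partitions):
        # Round-robin split of the source rows (an assumption of this sketch).
        partition_rows = rows[p::num_partitions]
        read_q, write_q = queue.Queue(), queue.Queue()
        threads += [
            threading.Thread(target=reader, args=(partition_rows, read_q)),
            threading.Thread(target=transformer, args=(read_q, write_q)),
            threading.Thread(target=writer, args=(write_q,)),
        ]

    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return written, len(threads)
```

Calling `run_partitioned_pipeline(list(range(10)), 3, lambda r: r * 2)` starts nine threads in this model: three partitions in each of the three stages. Row order across partitions is not guaranteed, so sort the result before comparing it.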
If you have the Partitioning option, you can configure multiple partitions for a single pipeline stage. You can configure partitioning information that controls the number of reader, transformation, and writer threads that the master thread creates for the pipeline. You can configure how the Integration Service reads data from the source, distributes rows of data to each transformation, and writes data to the target. You can configure the number of source and target connections to use.
Complete the following tasks to configure partitions for a session:
  • Set partition attributes, including partition points, the number of partitions, and the partition types.
  • Optionally, enable dynamic partitioning so that the Integration Service sets partitioning at run time. With dynamic partitioning enabled, the Integration Service scales the number of session partitions based on factors such as the source database partitions or the number of nodes in a grid.
  • Configure memory requirements and cache directories for each transformation after you configure the session for partitioning.
  • Use variable functions in the mapping to set mapping variable values. The Integration Service evaluates mapping variables for each partition in a target load order group.
  • Check that the session remains valid. When you create multiple partitions in a pipeline, the Workflow Manager verifies that the Integration Service can maintain data consistency in the session using the partitions. Editing object properties in the session can affect partitioning and cause the session to fail.
  • Add or edit partition points in the session properties. When you change partition points, you can define the partition type and add or delete partitions.
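The settings in the list above can be pictured as a small data model. The following Python sketch is hypothetical: the class names, the `DynamicMode` values, and the `effective_partitions` method are invented for illustration and do not correspond to any Informatica API or option label. It assumes that, when dynamic partitioning is enabled, the partition count simply follows the source database partition count or the grid node count. The actual scaling rules may differ.

```python
from dataclasses import dataclass
from enum import Enum


class DynamicMode(Enum):
    # Hypothetical mode names; not the product's option labels.
    DISABLED = "disabled"
    SOURCE_PARTITIONS = "source_partitions"
    GRID_NODES = "grid_nodes"


@dataclass(frozen=True)
class PartitionPoint:
    transformation: str  # name of the transformation where the point is set
    partition_type: str  # for example "pass-through"; not validated in this sketch


@dataclass
class SessionPartitioning:
    partition_points: list
    num_partitions: int = 1  # one partition per pipeline stage by default
    dynamic_mode: DynamicMode = DynamicMode.DISABLED

    def effective_partitions(self, source_db_partitions=1, grid_nodes=1):
        """Return the partition count the session would run with in this model."""
        if self.dynamic_mode is DynamicMode.SOURCE_PARTITIONS:
            return max(1, source_db_partitions)
        if self.dynamic_mode is DynamicMode.GRID_NODES:
            return max(1, grid_nodes)
        return self.num_partitions
```

For example, a `SessionPartitioning` with `num_partitions=3` reports three partitions while dynamic mode is disabled. After switching to `DynamicMode.GRID_NODES`, it reports whatever node count is passed in, with a floor of one.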


Updated June 03, 2019