One Thread for Each Pipeline Stage


When maximum parallelism is set to 1, partitioning is disabled. The Data Integration Service separates a mapping into pipeline stages and uses one reader thread, one transformation thread, and one writer thread to process each stage.
Each mapping contains one or more pipelines. A pipeline consists of a Read transformation and all the transformations that receive data from that Read transformation. The Data Integration Service separates a mapping pipeline into pipeline stages and then performs the extract, transformation, and load for each pipeline stage in parallel.
Partition points mark the boundaries in a pipeline and divide the pipeline into stages. For every mapping pipeline, the Data Integration Service adds a partition point after the Read transformation and before the Write transformation to create multiple pipeline stages.
Each pipeline stage runs in one of the following threads:
  • Reader thread that controls how the Data Integration Service extracts data from the source.
  • Transformation thread that controls how the Data Integration Service processes data in the pipeline.
  • Writer thread that controls how the Data Integration Service loads data to the target.
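The thread-per-stage design can be pictured as three worker threads connected by bounded queues at the partition points. The following Java example is a simplified, hypothetical sketch of that idea, not the Data Integration Service implementation: the class name, the queue capacity, and the use of a list of strings to stand in for a row set are assumptions made only for illustration.

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical three-stage pipeline: one reader thread, one transformation
// thread, and one writer thread, linked by bounded queues that stand in for
// the partition points after the Read transformation and before the Write
// transformation.
public class ThreeStagePipeline {

    // Sentinel row set that tells the next stage that no more data is coming.
    private static final List<String> END = new ArrayList<>();

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<List<String>> afterRead   = new ArrayBlockingQueue<>(10);
        BlockingQueue<List<String>> beforeWrite = new ArrayBlockingQueue<>(10);

        // Reader stage: extracts row sets from the source.
        Thread reader = new Thread(() -> {
            try {
                for (int i = 1; i <= 5; i++) {
                    afterRead.put(List.of("row set " + i));
                }
                afterRead.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // Transformation stage: processes each row set. A Filter and an
        // Expression transformation are reduced here to a trivial transform.
        Thread transformer = new Thread(() -> {
            try {
                List<String> rowSet;
                while ((rowSet = afterRead.take()) != END) {
                    List<String> transformed = new ArrayList<>();
                    for (String row : rowSet) {
                        transformed.add(row.toUpperCase());
                    }
                    beforeWrite.put(transformed);
                }
                beforeWrite.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // Writer stage: loads each transformed row set to the target.
        Thread writer = new Thread(() -> {
            try {
                List<String> rowSet;
                while ((rowSet = beforeWrite.take()) != END) {
                    System.out.println("writing " + rowSet);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // All three stages run at the same time on different row sets.
        reader.start();
        transformer.start();
        writer.start();

        reader.join();
        transformer.join();
        writer.join();
    }
}

Because the queues in this sketch are bounded, a stage that falls behind eventually blocks the stages before it, which is why a long-running transformation stage limits the throughput of the whole pipeline.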
The following figure shows a mapping separated into a reader pipeline stage, a transformation pipeline stage, and a writer pipeline stage:
The source and target are partition points. The reader pipeline stage contains a source, the transformation pipeline stage contains a Filter and an Expression transformation, and the writer pipeline stage contains the target.
Because the pipeline contains three stages, the Data Integration Service can process three sets of rows concurrently and optimize mapping performance. For example, while the reader thread processes the third row set, the transformation thread processes the second row set, and the writer thread processes the first row set.
The following table shows how multiple threads can concurrently process three sets of rows:
Reader Thread    Transformation Thread    Writer Thread
Row Set 1        -                        -
Row Set 2        Row Set 1                -
Row Set 3        Row Set 2                Row Set 1
Row Set 4        Row Set 3                Row Set 2
Row Set n        Row Set (n-1)            Row Set (n-2)
If the mapping pipeline contains transformations that perform complicated calculations, processing the transformation pipeline stage can take a long time. To optimize performance, the Data Integration Service adds partition points before some transformations to create an additional transformation pipeline stage.
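Continuing the hypothetical sketch above, an additional partition point corresponds to a second transformation thread with its own queue, so a quick Filter step and a slow Expression step can work on different row sets at the same time. The class name, the queue capacity, and the simulated delay are again assumptions made only for illustration.

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical split of the transformation work at an added partition point:
// a cheap Filter stage and an expensive Expression stage each get their own
// thread and queue, so both can process different row sets concurrently.
// A plain string stands in for a row set here.
public class SplitTransformationStage {

    private static final String END = "END";

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> toFilter     = new ArrayBlockingQueue<>(10);
        BlockingQueue<String> toExpression = new ArrayBlockingQueue<>(10);

        // First transformation stage: the Filter.
        Thread filterStage = new Thread(() -> {
            try {
                String rowSet;
                while (!(rowSet = toFilter.take()).equals(END)) {
                    if (!rowSet.contains("drop")) {      // cheap check
                        toExpression.put(rowSet);
                    }
                }
                toExpression.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // Second transformation stage: the Expression, standing in for a
        // calculation that takes much longer than the Filter.
        Thread expressionStage = new Thread(() -> {
            try {
                String rowSet;
                while (!(rowSet = toExpression.take()).equals(END)) {
                    Thread.sleep(50);                    // simulate heavy work
                    System.out.println("computed " + rowSet);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        filterStage.start();
        expressionStage.start();

        // The main thread acts as the upstream reader stage.
        for (int i = 1; i <= 5; i++) {
            toFilter.put(i == 3 ? "row set 3 drop" : "row set " + i);
        }
        toFilter.put(END);

        filterStage.join();
        expressionStage.join();
    }
}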