Advanced Workflow Guide

Setting Partition Types in the Pipeline
You can create different partition types at different points in the pipeline.
The following figure shows a mapping where you can create partition types to increase session performance:
The mapping contains the following elements: the Items flat file source, the SQ_Items source qualifier, the FIL_ActiveItems Filter transformation, the SRT_ItemsDescSort Sorter transformation, the AGG_AvgCostAndPrice Aggregator transformation, and the T_ITEM_PRICES Oracle target.
This mapping reads data about items and calculates average wholesale costs and prices. The mapping must read item information from three flat files of various sizes, and then filter out discontinued items. It sorts the active items by description, calculates the average prices and wholesale costs, and writes the results to a relational database in which the target tables are partitioned by key range.
You can delete the default partition point at the Aggregator transformation because hash auto-keys partitioning at the Sorter transformation sends all rows that contain items with the same description to the same partition. The Aggregator transformation therefore receives all of the data for each item description in a single partition and can calculate the average costs and prices for those items correctly.
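PowerCenter applies hash auto-keys partitioning internally when you configure the session; no code is written. Still, the guarantee the paragraph above relies on can be sketched in Python (the helper name and data are illustrative, not an Informatica API): hashing the group key decides the partition, so every row with the same description lands in the same partition and can be aggregated there without seeing other partitions.

```python
# Illustrative sketch of hash-key partitioning (not PowerCenter code).
# Rows are routed by a hash of the key column, so all rows sharing a
# key value end up in the same partition.
from collections import defaultdict

def hash_partition(rows, key, num_partitions):
    """Assign each row to a partition based on a hash of its key column."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[hash(row[key]) % num_partitions].append(row)
    return partitions

items = [
    {"description": "widget", "price": 10.0},
    {"description": "gadget", "price": 20.0},
    {"description": "widget", "price": 14.0},
]

parts = hash_partition(items, "description", 3)
# Both "widget" rows hash to the same partition, so a per-partition
# average over "widget" sees all of its rows.
```

This is why the default partition point at the Aggregator can be removed: once the Sorter's partitions already group by description, each downstream partition holds complete groups.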
When you use this mapping in a session, you can increase session performance by defining different partition types at the following partition points in the pipeline:
  • Source qualifier.
    To read data from the three flat files concurrently, you must specify three partitions at the source qualifier. Accept the default partition type, pass-through.
  • Filter transformation.
    Since the source files vary in size, each partition processes a different amount of data. Set a partition point at the Filter transformation, and choose round-robin partitioning to balance the load going into the Filter transformation.
  • Sorter transformation.
    To eliminate overlapping groups in the Sorter and Aggregator transformations, use hash auto-keys partitioning at the Sorter transformation. This causes the Integration Service to group all items with the same description into the same partition before the Sorter and Aggregator transformations process the rows. You can delete the default partition point at the Aggregator transformation.
  • Target.
    Since the target tables are partitioned by key range, specify key range partitioning at the target to optimize writing data to the target.
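The other two partition types chosen above, round-robin at the Filter and key range at the target, can be sketched the same way (hypothetical helper names, not an Informatica API): round-robin deals rows out one at a time so partition sizes stay balanced even when the source files differ in size, while key range routes each row to the partition whose range covers its key, mirroring how the target tables are partitioned.

```python
# Illustrative sketches of round-robin and key-range partitioning
# (not PowerCenter code).
from itertools import cycle

def round_robin_partition(rows, num_partitions):
    """Deal rows across partitions in turn to balance the load."""
    partitions = [[] for _ in range(num_partitions)]
    targets = cycle(range(num_partitions))
    for row in rows:
        partitions[next(targets)].append(row)
    return partitions

def key_range_partition(rows, key, ranges):
    """Route each row to the partition whose (low, high) range covers
    its key; low is inclusive, high is exclusive."""
    partitions = [[] for _ in ranges]
    for row in rows:
        for i, (lo, hi) in enumerate(ranges):
            if lo <= row[key] < hi:
                partitions[i].append(row)
                break
    return partitions
```

Round-robin ignores the row contents entirely, which is why it evens out skewed sources; key range depends on the key values, which is why its ranges should match the target tables' partitioning scheme.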
