Advanced Workflow Guide

10.4.1

Partition Types

When you configure the partitioning information for a pipeline, you must define a partition type at each partition point in the pipeline. The partition type determines how the PowerCenter Integration Service redistributes data across partition points.

The PowerCenter Integration Service creates a default partition type at each partition point. If you have the Partitioning option, you can change the partition type, and you can use different partition types at different points in the pipeline.

You can define the following partition types in the Workflow Manager:

Database partitioning.
The PowerCenter Integration Service queries the IBM DB2 or Oracle database system for table partition information. It reads partitioned data from the corresponding nodes in the database. You can use database partitioning with Oracle or IBM DB2 source instances on a multi-node tablespace. You can use database partitioning with DB2 targets.

Hash auto-keys.
The PowerCenter Integration Service uses a hash function to group rows of data among partitions. The PowerCenter Integration Service groups the data based on a partition key. The PowerCenter Integration Service uses all grouped or sorted ports as a compound partition key. You may need to use hash auto-keys partitioning at Rank, Sorter, and unsorted Aggregator transformations.

Hash user keys.
The PowerCenter Integration Service uses a hash function to group rows of data among partitions. You define one or more ports to generate the partition key.

Key range.
With key range partitioning, the PowerCenter Integration Service distributes rows of data based on a port or set of ports that you define as the partition key. For each port, you define a range of values. The PowerCenter Integration Service uses the key and ranges to send rows to the appropriate partition. Use key range partitioning when the sources or targets in the pipeline are partitioned by key range.

Pass-through.
In pass-through partitioning, the PowerCenter Integration Service processes data without redistributing rows among partitions. All rows in a single partition stay in the partition after crossing a pass-through partition point. Choose pass-through partitioning when you want to create an additional pipeline stage to improve performance, but do not want to change the distribution of data across partitions.

Round-robin.
The PowerCenter Integration Service distributes blocks of data to one or more partitions. Use round-robin partitioning so that each partition processes rows based on the number and size of the blocks.
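
The routing behavior of the hash, key range, and round-robin partition types can be illustrated with a minimal Python sketch. This is not PowerCenter code and does not reflect how the PowerCenter Integration Service is implemented. The function names, the CRC32 hash, the half-open `[start, end)` range convention, and the error raised for rows outside every range are all assumptions chosen for illustration.

```python
import zlib


def hash_partition(rows, key_ports, num_partitions):
    """Group rows so that equal key values always land in the same partition."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        # Build a compound key from the chosen ports (hypothetical encoding).
        key = "|".join(str(row[port]) for port in key_ports).encode("utf-8")
        # CRC32 stands in for the hash function; the real one is not documented here.
        partitions[zlib.crc32(key) % num_partitions].append(row)
    return partitions


def key_range_partition(rows, key_port, ranges):
    """Send each row to the partition whose range contains its key value.

    Assumes one (start, end) pair per partition, start inclusive and end
    exclusive. Rows outside every range raise an error in this sketch.
    """
    partitions = [[] for _ in ranges]
    for row in rows:
        for index, (start, end) in enumerate(ranges):
            if start <= row[key_port] < end:
                partitions[index].append(row)
                break
        else:
            raise ValueError(f"no key range covers {row[key_port]!r}")
    return partitions


def round_robin_partition(blocks, num_partitions):
    """Hand out whole blocks of rows to the partitions in turn."""
    partitions = [[] for _ in range(num_partitions)]
    for index, block in enumerate(blocks):
        partitions[index % num_partitions].extend(block)
    return partitions


if __name__ == "__main__":
    orders = [{"id": 5}, {"id": 10}, {"id": 19}]
    print(key_range_partition(orders, "id", [(0, 10), (10, 20)]))
    print(round_robin_partition([[1, 2], [3], [4, 5], [6]], 2))
```

Pass-through partitioning needs no routing function in this model, because each partition keeps exactly the rows it already holds. The sketch also shows why hash partitioning suits transformations that group or sort data: every row with the same key value reaches the same partition, whereas round-robin balances load without regard to row content.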
