
Advanced Workflow Guide

10.4.1


Rules and Guidelines for Partitioning File Sources

Use the following rules and guidelines when you configure a file source session with multiple partitions:

Use pass-through partitioning at the source qualifier.

Use single- or multi-threaded reading with flat file or COBOL sources.

Use single-threaded reading with XML sources.

You cannot use multi-threaded reading if the source files are non-disk files, such as FTP files or WebSphere MQ sources.

If you use a shift-sensitive code page, use multi-threaded reading if the following conditions are true:

- The file is fixed-width.
- The file is not line sequential.
- You did not enable user-defined shift state in the source definition.
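The reading-mode rules above can be summarized as a small decision function. This is a minimal sketch, not part of any Informatica API: the `FileSource` class, its field names, and `allowed_reading_modes` are hypothetical names introduced for illustration. It also assumes that, with a shift-sensitive code page, multi-threaded reading is available only when all three listed conditions hold.

```python
from dataclasses import dataclass
from typing import Set


@dataclass
class FileSource:
    """Hypothetical description of a file source (illustration only)."""
    kind: str                               # "flat", "cobol", or "xml"
    on_disk: bool = True                    # False for FTP files or WebSphere MQ sources
    shift_sensitive_code_page: bool = False
    fixed_width: bool = False
    line_sequential: bool = False
    user_defined_shift_state: bool = False


def allowed_reading_modes(src: FileSource) -> Set[str]:
    """Return the reading modes the rules above permit for this source."""
    if src.kind == "xml":
        # XML sources use single-threaded reading.
        return {"single"}
    if src.kind not in ("flat", "cobol"):
        raise ValueError(f"unknown source kind: {src.kind!r}")
    if not src.on_disk:
        # Non-disk files cannot use multi-threaded reading.
        return {"single"}
    if src.shift_sensitive_code_page:
        # Assumption: all three conditions must be true for multi-threaded reading.
        multi_ok = (
            src.fixed_width
            and not src.line_sequential
            and not src.user_defined_shift_state
        )
        return {"single", "multi"} if multi_ok else {"single"}
    return {"single", "multi"}
```

For example, `allowed_reading_modes(FileSource(kind="flat", on_disk=False))` yields only `"single"`, which mirrors the rule for FTP and WebSphere MQ sources.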

To read data from multiple flat files concurrently, specify one partition for each file at the source qualifier. For example, to read three flat files concurrently, specify three partitions. Accept the default partition type, pass-through.

If you configure a session for multi-threaded reading, and the Integration Service cannot create multiple threads to a file source, it writes a message to the session log and reads the source with one thread.

When the Integration Service uses multiple threads to read a source file, it may not read the rows in the file sequentially. If sort order is important, configure the session to read the file with a single thread. For example, sort order may be important if the mapping contains a sorted Joiner transformation and the file source is the sort origin.
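The ordering effect described above can be reproduced outside PowerCenter. The following sketch is an illustration only and does not reflect how the Integration Service actually divides a file: it assumes a simple scheme in which each thread reads the lines that begin in its own byte range, and it collects results in whichever order the threads finish. All function names are hypothetical.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor, as_completed


def write_sample(n):
    """Create a temporary file with n newline-terminated rows and return its path."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "wb") as f:
        for i in range(n):
            f.write(f"row-{i:05d}\n".encode())
    return path


def read_range(path, start, end):
    """Return the rows whose first byte lies in the byte range [start, end)."""
    rows = []
    with open(path, "rb") as f:
        if start > 0:
            # Step back one byte and read to the next newline, so that the
            # position lands on the first row that begins at or after `start`.
            f.seek(start - 1)
            f.readline()
        while f.tell() < end:
            line = f.readline()
            if not line:
                break
            rows.append(line.rstrip(b"\n").decode())
    return rows


def read_with_threads(path, threads):
    """Read the file with one byte range per thread, in completion order."""
    size = os.path.getsize(path)
    if size == 0:
        return []
    step = -(-size // threads)  # ceiling division
    ranges = [(s, min(s + step, size)) for s in range(0, size, step)]
    rows = []
    with ThreadPoolExecutor(max_workers=threads) as pool:
        futures = [pool.submit(read_range, path, s, e) for s, e in ranges]
        for future in as_completed(futures):
            # Completion order is not guaranteed to match file order.
            rows.extend(future.result())
    return rows
```

With `threads=1` the rows come back in file order. With several threads, every row is still read exactly once, but the combined sequence may differ from the file order, which is why a downstream step that depends on sorted input calls for a single reader.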

You can also use a combination of direct and indirect files to balance the load.

Session performance for multi-threaded reading is optimal with large source files. The load may be unbalanced if the amount of input data is small.

You cannot use a command for a file source if the command generates source data and the session either runs on a grid or uses the resume from the last checkpoint recovery strategy.
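The command restriction above can be written as a pre-flight check. This is a sketch under stated assumptions: the function name, its parameters, and the `RESUME_FROM_LAST_CHECKPOINT` label are hypothetical and do not correspond to actual PowerCenter session properties or APIs.

```python
from typing import List

# Hypothetical label for the recovery strategy named in the rule above.
RESUME_FROM_LAST_CHECKPOINT = "resume_from_last_checkpoint"


def command_source_violations(
    command_generates_data: bool,
    runs_on_grid: bool,
    recovery_strategy: str,
) -> List[str]:
    """Return the reasons a command-based file source is not allowed, if any."""
    violations = []
    if command_generates_data:
        if runs_on_grid:
            violations.append("session is configured to run on a grid")
        if recovery_strategy == RESUME_FROM_LAST_CHECKPOINT:
            violations.append("session resumes from the last checkpoint")
    return violations
```

An empty list means the rule does not block the configuration. The restriction applies only to commands that generate source data, so the function returns an empty list whenever `command_generates_data` is false.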
