Rules and Guidelines for Partitioning File Sources
Use the following rules and guidelines when you configure a file source session with multiple partitions:
Use pass-through partitioning at the source qualifier.
Use single- or multi-threaded reading with flat file or COBOL sources.
Use single-threaded reading with XML sources.
You cannot use multi-threaded reading if the source files are non-disk files, such as FTP files or WebSphere MQ sources.
If you use a shift-sensitive code page, use multi-threaded reading only if all of the following conditions are true:
The file is fixed-width.
The file is not line sequential.
You did not enable user-defined shift state in the source definition.
To read data from multiple flat files concurrently, specify one partition for each file at the source qualifier. For example, to read three flat files concurrently, specify three partitions. Accept the default partition type, pass-through.
If you configure a session for multi-threaded reading, and the Integration Service cannot create multiple threads to a file source, it writes a message to the session log and reads the source with one thread.
When the Integration Service uses multiple threads to read a source file, it may not read the rows in the file sequentially. If sort order is important, configure the session to read the file with a single thread. For example, sort order may be important if the mapping contains a sorted Joiner transformation and the file source is the sort origin.
You can use a combination of direct and indirect files to balance the load across partitions.
Session performance for multi-threaded reading is optimal with large source files. The load may be unbalanced if the amount of input data is small.
You cannot use a command for a file source if the command generates source data and the session is either configured to run on a grid or configured with the resume from last checkpoint recovery strategy.
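The hard restrictions above can be summarized as a small pre-flight check. The following Python sketch is illustrative only: the FileSourceSession class, its field names, and the check_rules function are hypothetical and are not part of any Informatica API or session property set. It also assumes that the three shift-sensitive code page conditions must all hold before multi-threaded reading is allowed.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FileSourceSession:
    # Hypothetical model of a session; these are not real PowerCenter properties.
    source_type: str = "flat_file"        # "flat_file", "cobol", or "xml"
    partition_type: str = "pass-through"  # partition type at the source qualifier
    multi_threaded: bool = False
    on_disk: bool = True                  # False for FTP files or WebSphere MQ sources
    shift_sensitive_code_page: bool = False
    fixed_width: bool = False
    line_sequential: bool = False
    user_defined_shift_state: bool = False
    command_generates_data: bool = False
    runs_on_grid: bool = False
    resume_from_last_checkpoint: bool = False


def check_rules(s: FileSourceSession) -> List[str]:
    """Return a list of guideline violations; an empty list means none were found."""
    problems = []
    if s.partition_type != "pass-through":
        problems.append("use pass-through partitioning at the source qualifier")
    if s.multi_threaded:
        if s.source_type == "xml":
            problems.append("XML sources require single-threaded reading")
        if not s.on_disk:
            problems.append("non-disk sources cannot use multi-threaded reading")
        # Assumption: all three conditions must be true for a shift-sensitive code page.
        shift_ok = (
            s.fixed_width
            and not s.line_sequential
            and not s.user_defined_shift_state
        )
        if s.shift_sensitive_code_page and not shift_ok:
            problems.append("shift-sensitive code page conditions are not met")
    if s.command_generates_data and (s.runs_on_grid or s.resume_from_last_checkpoint):
        problems.append("data-generating command not allowed with grid or checkpoint recovery")
    return problems
```

For example, check_rules(FileSourceSession(source_type="xml", multi_threaded=True)) reports the XML restriction, while a default FileSourceSession() reports nothing. The sketch covers only the restrictions stated as rules; guidance about sort order and load balancing is a judgment call and is left out.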