Advanced Workflow Guide

Rules and Guidelines for Partitioning File Sources

Use the following rules and guidelines when you configure a file source session with multiple partitions:
  • Use pass-through partitioning at the source qualifier.
  • Use single- or multi-threaded reading with flat file or COBOL sources.
  • Use single-threaded reading with XML sources.
  • You cannot use multi-threaded reading if the source files are non-disk files, such as FTP files or WebSphere MQ sources.
  • If you use a shift-sensitive code page, use multi-threaded reading only if all of the following conditions are true:
    • The file is fixed-width.
    • The file is not line sequential.
    • You did not enable user-defined shift state in the source definition.
  • To read data from multiple flat files concurrently, specify one partition for each file at the source qualifier. For example, to read three flat files concurrently, you must specify three partitions. Accept the default partition type, pass-through.
  • If you configure a session for multi-threaded reading, and the Integration Service cannot create multiple threads to a file source, it writes a message to the session log and reads the source with one thread.
  • When the Integration Service uses multiple threads to read a source file, it may not read the rows in the file sequentially. If sort order is important, configure the session to read the file with a single thread. For example, sort order may be important if the mapping contains a sorted Joiner transformation and the file source is the sort origin.
  • You can use a combination of direct and indirect files to balance the load across partitions.
  • Session performance for multi-threaded reading is optimal with large source files. The load may be unbalanced if the amount of input data is small.
  • You cannot use a command for a file source if the command generates source data and the session either runs on a grid or uses the resume from the last checkpoint recovery strategy.
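
The reading-mode rules above can be read as a small decision procedure. The following Python sketch is one way to express them and is not part of any Informatica API: the `FileSource` class, its field names, and both functions are hypothetical names introduced only for illustration. It also assumes that all three shift-sensitive code page conditions must hold together.

```python
from dataclasses import dataclass


@dataclass
class FileSource:
    """Hypothetical description of a file source (not a PowerCenter object)."""
    kind: str                              # "flat", "cobol", or "xml"
    on_disk: bool = True                   # False for FTP files or WebSphere MQ sources
    shift_sensitive_code_page: bool = False
    fixed_width: bool = False
    line_sequential: bool = False
    user_defined_shift_state: bool = False
    sort_order_matters: bool = False       # e.g. the file feeds a sorted Joiner


def can_read_multi_threaded(src: FileSource) -> bool:
    """Return True if the guidelines permit multi-threaded reading."""
    if src.kind not in ("flat", "cobol"):
        return False                       # XML sources use single-threaded reading
    if not src.on_disk:
        return False                       # non-disk files cannot be read multi-threaded
    if src.shift_sensitive_code_page:
        # Assumption: all three conditions must be true at the same time.
        return (
            src.fixed_width
            and not src.line_sequential
            and not src.user_defined_shift_state
        )
    return True


def recommended_reading(src: FileSource) -> str:
    """Pick a reading mode, preferring one thread when sort order matters."""
    if can_read_multi_threaded(src) and not src.sort_order_matters:
        return "multi-threaded"
    return "single-threaded"


def command_source_allowed(generates_data: bool, on_grid: bool,
                           resume_from_checkpoint: bool) -> bool:
    """Apply the guideline on commands that generate source data."""
    return not (generates_data and (on_grid or resume_from_checkpoint))
```

For example, `recommended_reading(FileSource(kind="flat"))` evaluates to `"multi-threaded"`, while a COBOL source with `sort_order_matters=True` yields `"single-threaded"`. The sketch only mirrors the guidelines. At run time the Integration Service still decides whether it can create multiple threads, and it falls back to one thread if it cannot.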


Updated November 14, 2019