Table of Contents

Search

  1. Preface
  2. Mappings
  3. Mapplets
  4. Mapping Parameters
  5. Where to Assign Parameters
  6. Mapping Outputs
  7. Generate a Mapping from an SQL Query
  8. Dynamic Mappings
  9. How to Develop and Run a Dynamic Mapping
  10. Dynamic Mapping Use Cases
  11. Mapping Administration
  12. Export to PowerCenter
  13. Import From PowerCenter
  14. Performance Tuning
  15. Pushdown Optimization
  16. Partitioned Mappings
  17. Developer Tool Naming Conventions

Developer Mapping Guide

Developer Mapping Guide

Partitioned Flat File Sources

Partitioned Flat File Sources

When a mapping that is enabled for partitioning reads from a flat file source, the Data Integration Service can use multiple threads to read the file source.
The Data Integration Service can create partitions for the following flat file source types:
  • Direct file
  • Indirect file
  • Directory of files
  • Command
  • File or directory of files in Hadoop Distributed File System (HDFS)
When the Data Integration Service uses multiple threads to read a file source, it creates multiple concurrent connections to the source. By default, the Data Integration Service does not preserve row order because it does not read the rows in the file or file list sequentially. To preserve row order when multiple threads read from a single file source, configure concurrent read partitioning.
When the Data Integration Service uses multiple threads to read a direct file, it creates multiple reader threads to read the file concurrently.
When the Data Integration Service uses multiple threads to read an indirect file or a directory of files, it creates multiple reader threads to read the files in the list or directory concurrently. The Data Integration Service might use multiple threads to read a single file. Or, the Data Integration Service might use a single thread to read multiple files in the list or directory.

0 COMMENTS

We’d like to hear from you!