PowerExchange for Microsoft Azure Data Lake Storage Gen2 User Guide

Microsoft Azure Data Lake Storage Gen2 Read Use Case

Reading a large data set can take a long time. To improve performance, you can configure the following read operation properties to partition the source and read the partitions concurrently:
  • Block Size: Partitions a large file or object into smaller parts, each of the specified block size. When you read a large file, consider partitioning it into smaller parts and configuring Concurrent Threads to spawn the required number of threads to process the data in parallel.
  • Concurrent Threads: The number of concurrent connections used to read data from Microsoft Azure Data Lake Storage Gen2. When you read a large file or object, you can spawn multiple threads to process the data. Default is 10. You must configure Block Size if you want multiple threads to process data in parallel.
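The interaction between the two properties can be illustrated with a minimal Python sketch. It is not the PowerExchange implementation and it reads a local file rather than Microsoft Azure Data Lake Storage Gen2. The function names and the `block_size` and `concurrent_threads` parameters are hypothetical stand-ins for the Block Size and Concurrent Threads properties.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def block_ranges(total_size, block_size):
    """Return (offset, length) pairs that together cover total_size bytes."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [(offset, min(block_size, total_size - offset))
            for offset in range(0, total_size, block_size)]


def read_block(path, offset, length):
    """Read one block. Each call opens its own handle, so threads share no state."""
    with open(path, "rb") as handle:
        handle.seek(offset)
        return handle.read(length)


def parallel_read(path, block_size, concurrent_threads=10):
    """Read a file block by block with up to concurrent_threads workers."""
    ranges = block_ranges(os.path.getsize(path), block_size)
    with ThreadPoolExecutor(max_workers=concurrent_threads) as pool:
        # map() yields results in input order, so blocks are joined in file order.
        blocks = pool.map(lambda r: read_block(path, r[0], r[1]), ranges)
        return b"".join(blocks)


if __name__ == "__main__":
    # Illustration only: write a small temporary file and read it back in blocks.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"x" * 10_000)
    data = parallel_read(tmp.name, block_size=1_024, concurrent_threads=4)
    print(len(data))
    os.remove(tmp.name)
```

In this sketch, a block size equal to or larger than the file produces a single range, so only one worker has anything to read regardless of the thread count. That mirrors the reason Block Size must be set before additional threads can help.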
The following image shows the read operation properties required to read data in parallel from a large source file:
