

Microsoft Azure Data Lake Storage Gen2 Connector


Target partitioning


You can configure partitioning to optimize mapping performance at run time when you write data to Microsoft Azure Data Lake Storage Gen2. You can configure target partitioning only in mappings.
The partition type controls how the Secure Agent distributes data among partitions at partition points. With partitioning, the Secure Agent distributes rows of target data across the number of threads that you define as partitions.
For example, if there are three partitions in the source, the Secure Agent writes a separate file for each partition to the Microsoft Azure Data Lake Storage Gen2 target, using the following naming format:
<target> <target_1> <target_2>
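The naming pattern above can be sketched in a few lines of Python. This is only an illustration: the function name `partition_file_names` is hypothetical, the target name is treated as an opaque string, and extending the pattern beyond the three-partition example is an assumption rather than documented behavior.

```python
def partition_file_names(target: str, partitions: int) -> list[str]:
    """Return the file names implied by the <target>, <target_1>, <target_2> pattern."""
    if partitions < 1:
        raise ValueError("partitions must be at least 1")
    # Assumption: the first partition keeps the bare target name and each
    # later partition gets a numeric suffix starting at _1.
    return [target] + [f"{target}_{i}" for i in range(1, partitions)]


print(partition_file_names("orders", 3))  # ['orders', 'orders_1', 'orders_2']
```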
Consider the following rules and guidelines for target partitioning:
  • When you read from a directory with multiple partitions and configure target partitioning, the Secure Agent writes partition files to the target based on the number of partitions in the source. If you change the number of partitions in the source and run the mapping task again, verify the existing partition files to avoid inconsistent data in the target.
  • When you read data with multiple partitions and configure target partitioning, ensure that the target file name is unique and does not match the _part file name from any previous mapping run. Otherwise, the target file might contain inconsistent data.
  • You can use the append write strategy only for flat files.
  • When you read from a Parquet file and write to partitions in a Microsoft Azure Data Lake Storage Gen2 target, the recommended heap size for each partition is 0.5 GB.
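The first guideline, verifying existing partition files after the source partition count changes, can be approximated with a small check like the one below. It is a sketch under stated assumptions: the helper `stale_partition_files` is hypothetical, the file listing is passed in as plain names rather than fetched from the storage account, and it assumes the `<target>_<n>` naming shown in the example above.

```python
import re


def stale_partition_files(existing: list[str], target: str, partitions: int) -> list[str]:
    """List files left by an earlier run that used more partitions than the current one."""
    pattern = re.compile(rf"^{re.escape(target)}_(\d+)$")
    stale = []
    for name in existing:
        match = pattern.match(name)
        # Assumption: a run with N partitions writes suffixes _1 .. _(N-1),
        # so any suffix at or above N comes from an earlier run.
        if match and int(match.group(1)) >= partitions:
            stale.append(name)
    # Order by numeric suffix so that _10 sorts after _9.
    return sorted(stale, key=lambda n: int(n.rsplit("_", 1)[1]))


previous_run = ["orders", "orders_1", "orders_2", "orders_3", "orders_4"]
print(stale_partition_files(previous_run, "orders", 3))  # ['orders_3', 'orders_4']
```

Files reported by such a check are candidates for review before the next run; whether to delete or archive them depends on how downstream consumers read the target directory.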
