PowerExchange for Microsoft Azure Data Lake Storage Gen2 User Guide

Directory-Level Partitioning

When you run a mapping on the Spark engine or Databricks Spark engine, you can read data from and write data to Avro, ORC, and Parquet files that are partitioned based on directories.
You must import a directory that contains only partition folders and select the source type as Directory in the advanced read properties.
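
To make the "only partition folders" requirement concrete, here is a minimal Python sketch that checks a local directory before you import it. It assumes the Hive-style `column=value` folder naming that Spark conventionally uses for partitioned data, which this guide does not spell out. The function name `contains_only_partition_folders` and the sample folders are hypothetical and not part of PowerExchange, and the check may be stricter than what the product actually accepts.

```python
import tempfile
from pathlib import Path


def contains_only_partition_folders(root: Path) -> bool:
    """Return True if every entry directly under root is a `column=value` folder.

    Assumption: partition folders follow the Hive-style `column=value` naming.
    """
    entries = list(root.iterdir())
    if not entries:
        return False  # an empty directory has no partitions to read
    for entry in entries:
        name, sep, value = entry.name.partition("=")
        if not (entry.is_dir() and sep and name and value):
            return False
    return True


# Build a small hypothetical layout on local disk to exercise the check.
with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "sales"
    (good / "year=2023").mkdir(parents=True)
    (good / "year=2024").mkdir()

    mixed = Path(tmp) / "mixed"
    (mixed / "year=2023").mkdir(parents=True)
    (mixed / "notes.txt").write_text("stray file")

    good_ok = contains_only_partition_folders(good)
    mixed_ok = contains_only_partition_folders(mixed)

print(good_ok, mixed_ok)
```

In this sketch the `sales` folder passes because it holds only partition folders, while `mixed` fails because of the stray file next to the partition folder.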

Importing a data object with partition files

Perform the following steps to import a data object that reads from or writes to partition files:
  1. Select a project or folder in the Object Explorer view.
  2. Click File > New > Data Object.
  3. Select Microsoft Azure Data Lake Storage Gen2 Data Object and click Next.
    The Microsoft Azure Data Lake Storage Gen2 Data Object dialog box appears.
  4. Click Browse and select the target project or folder.
  5. In the Resource Format list, select Avro, Parquet, or ORC.
  6. Click Add to add a resource to the data object.
    The Add Resource dialog box appears. You can use the File Type column to distinguish between a directory and a file.
  7. Select the check box for a directory and click OK.
  8. Click Finish.
    The partitioned columns are displayed with the order of partitioning in the data object Overview tab.
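
As an illustration of what "order of partitioning" means for an imported directory, the following Python sketch derives the partition columns from the folder nesting of a single file path, outermost folder first. It assumes Hive-style `column=value` folder names. The function `partition_columns`, the path, and the column names are hypothetical and only show the idea; this is not how the Developer tool is implemented.

```python
from pathlib import PurePosixPath
from typing import List, Tuple


def partition_columns(file_path: str, root: str) -> List[Tuple[str, str]]:
    """Return (column, value) pairs in directory order, outermost first."""
    relative = PurePosixPath(file_path).relative_to(PurePosixPath(root))
    pairs = []
    for part in relative.parts[:-1]:  # skip the file name itself
        column, sep, value = part.partition("=")
        if not sep:
            raise ValueError(f"not a partition folder: {part!r}")
        pairs.append((column, value))
    return pairs


# Hypothetical path; the layout and column names are illustrative only.
example = "/data/sales/year=2024/month=03/part-0000.parquet"
pairs = partition_columns(example, "/data/sales")
order = [column for column, _ in pairs]
print(order)  # ['year', 'month']
```

Here `year` is the first partition column and `month` the second, because the `month` folders are nested inside the `year` folders.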

Creating a target with partition files

Perform the following steps to create a target with partition files:
  1. Select a project or folder in the Object Explorer view.
  2. Select a source or a transformation in the mapping.
  3. Right-click the Source transformation and select Create Target.
    The Create Target dialog box appears.
  4. Select Others, and then select the Microsoft Azure Data Lake Storage Gen2 data object from the list in the Data Object Type section.
  5. Click OK.
    The New Microsoft Azure Data Lake Storage Gen2 Data Object dialog box appears.
  6. Enter a name for the data object.
  7. Enter the partition fields.
    You can edit the partition fields in the Edit partition fields dialog box.
  8. Optionally, change the partition order by using the up and down arrows.
  9. Click Finish.
    The partitioned columns are displayed with the order of partitioning in the data object Overview tab.
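
The partition order matters because it determines how the partition folders are nested in the target. The following Python sketch shows that effect under the assumption of Hive-style `column=value` folder names. The helper `partition_directory`, the output root, and the row values are hypothetical. The sketch does not call any PowerExchange or Spark API.

```python
from pathlib import PurePosixPath
from typing import Dict, List


def partition_directory(root: str, order: List[str], row: Dict[str, str]) -> str:
    """Build the folder for a row, nesting partition fields in the given order."""
    path = PurePosixPath(root)
    for field in order:
        path = path / f"{field}={row[field]}"
    return str(path)


# Hypothetical row; only the fields named in `order` become folders.
row = {"country": "DE", "year": "2024", "amount": "10.5"}

by_country_first = partition_directory("/out/sales", ["country", "year"], row)
by_year_first = partition_directory("/out/sales", ["year", "country"], row)

print(by_country_first)  # /out/sales/country=DE/year=2024
print(by_year_first)     # /out/sales/year=2024/country=DE
```

Moving a field up or down in the partition order swaps the nesting level of its folder, so the same row lands in a different directory.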
