PowerExchange for Amazon S3 User Guide

Directory-Level Partitioning

When you run mappings on the Databricks or Spark engines, you can read data from the following file types:
  • Avro
  • Flat
  • ORC
  • Parquet
  • Intelligent Structure Model (XML)
Additionally, you can write data to the following file types:
  • Avro
  • ORC
  • Parquet
  • Intelligent Structure Model (XML)
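
With directory-level partitioning, each partition field typically appears as a name=value directory level in the S3 path. The following is a minimal sketch, assuming a hypothetical bucket my-bucket and year and month partition fields (none of which come from this guide), of such a layout and of a PySpark read on an S3-enabled Spark environment that discovers the partition columns from the directory names:

    s3://my-bucket/sales/year=2023/month=01/part-0000.parquet
    s3://my-bucket/sales/year=2023/month=02/part-0000.parquet
    s3://my-bucket/sales/year=2024/month=01/part-0000.parquet

    # A minimal PySpark sketch; bucket, path, and fields are hypothetical.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("read-partitioned-s3").getOrCreate()

    # Spark infers year and month as columns from the year=.../month=... directories.
    df = spark.read.parquet("s3a://my-bucket/sales/")
    df.printSchema()                               # schema includes year and month
    df.filter("year = 2023 AND month = 1").show()  # reads only the matching directory

Filtering on a partition column lets the engine skip the directories that do not match, which is the main benefit of this layout.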

Importing a data object with partition files

Perform the following steps to import a data object that reads from or writes to partition files:
  1. Select a project or folder in the Object Explorer view.
  2. Click File > New > Data Object.
  3. Select AmazonS3 Data Object and click Next.
    The AmazonS3 Data Object dialog box appears.
  4. Click Browse next to the Location option and select the target project or folder.
  5. In the Resource Format list, select Avro, Flat, ORC, Parquet, Intelligent Structure Model, or Sample File.
  6. To add a resource to the data object, click Add next to the Selected Resource option.
    The Add Resource dialog box appears. You can use the File Type column to distinguish between a directory and a file.
  7. The following image shows the Add Resource dialog box, where you can select a file name or a directory.
  8. Select the check box for a directory and click OK.
  9. When you select CSV as the Resource Format, you can configure the following format properties and preview the flat file object. For how the delimiter, text qualifier, and escape character interact, see the sketch after these steps.
    • Delimiters. Character used to separate columns of data. If you enter a delimiter that is the same as the escape character or the text qualifier, you might receive unexpected results. The Amazon S3 reader and writer support delimiters. You cannot specify a multibyte character as a delimiter.
    • Text Qualifier. Quote character that defines the boundaries of text strings. If you select a quote character, the Developer tool ignores delimiters within pairs of quotes. The Amazon S3 reader supports text qualifiers.
    • Import Column Names From First Line. If selected, the Developer tool uses data in the first row for column names. Select this option if column names appear in the first row. The Developer tool prefixes "FIELD_" to field names that are not valid. The Amazon S3 reader and writer support this property.
    • Row Delimiter. Line break character. Select a character from the list or enter one. Preface an octal code with a backslash (\). To use a single character, enter the character. The Data Integration Service uses only the first character when the entry is not preceded by a backslash. The character must be a single-byte character, and no other character in the code page can contain that byte. Default is line feed, \012 LF (\n).
    • Escape Character. Character immediately preceding a column delimiter character embedded in an unquoted string, or immediately preceding the quote character in a quoted string. When you specify an escape character, the Data Integration Service reads the delimiter character as a regular character.
    The Start import at line, Treat consecutive delimiters as one, and Retain escape character in data properties in the Column Projection dialog box are not applicable for PowerExchange for Amazon S3.
    Click Next to preview the flat file data object.
  10. Click Finish.
    The partitioned columns are displayed with the order of partitioning on the data object Overview tab.
    The following image shows the Overview tab with the partition order.
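
The following is a minimal sketch of how the Delimiters, Text Qualifier, and Escape Character properties interact, using Python's csv module as a stand-in for the flat-file parser. The sample rows and characters are illustrative only and are not part of PowerExchange.

    import csv
    import io

    # "Doe, Jane" is wrapped in the text qualifier ("), so the embedded comma is
    # not treated as a column delimiter. In John\,Smith the escape character (\)
    # makes the delimiter read as a regular character in an unquoted string.
    raw = 'id,name,city\n1,"Doe, Jane",Boston\n2,John\\,Smith,Austin\n'
    reader = csv.reader(io.StringIO(raw), delimiter=',', quotechar='"', escapechar='\\')
    for row in reader:
        print(row)
    # ['id', 'name', 'city']
    # ['1', 'Doe, Jane', 'Boston']
    # ['2', 'John,Smith', 'Austin']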

Creating a target with partition files

Perform the following steps to create a target with partition files:
  1. Select a project or folder in the Object Explorer view.
  2. Select a source or a transformation in the mapping.
  3. Right-click the Source transformation and select Create Target.
    The Create Target dialog box appears.
    The following image shows the Create Target option.
  4. Select Others and then select the AmazonS3 data object from the list in the Data Object Type section.
  5. Click OK.
    The New AmazonS3 Data Object dialog box appears.
    The following image shows the New AmazonS3 Data Object dialog box.
  6. Enter a name for the data object.
  7. Enter the partition fields.
    The following image shows the Edit partition fields dialog box, where you can edit the partition fields.
  8. You can change the partition order using the up and down arrows. The partition order determines how the output directories are nested; see the sketch after these steps.
    The following image shows the partitioned fields after you change the order.
  9. Click Finish.
    The partitioned columns are displayed with the order of partitioning on the data object Overview tab.
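
The partition fields and their order map directly to how the output directories are nested. The following is a minimal PySpark sketch of an equivalent write, assuming a hypothetical bucket my-bucket and country and year partition fields; the mapping itself requires no code, and the Spark engine underneath behaves in a conceptually similar way.

    # A minimal PySpark sketch; bucket, path, and fields are hypothetical.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("write-partitioned-s3").getOrCreate()
    df = spark.createDataFrame(
        [(1, "US", 2023), (2, "DE", 2023), (3, "US", 2024)],
        ["id", "country", "year"],
    )

    # The partitionBy order mirrors the partition order chosen in the dialog box:
    # country first, then year -> .../country=US/year=2023/part-*.parquet
    df.write.mode("overwrite").partitionBy("country", "year").parquet("s3a://my-bucket/target/")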
