PowerExchange for Amazon S3 User Guide

Directory-Level Partitioning

When you run mappings on the Databricks or Spark engines, you can read data from the following file types:
  • Avro
  • Flat
  • ORC
  • Parquet
  • Intelligent Structure Model (XML)
Additionally, you can write data to the following file types:
  • Avro
  • ORC
  • Parquet
  • Intelligent Structure Model (XML)
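
With directory-level partitioning, each partition field typically appears as a name=value directory level in the S3 path. The following is a minimal sketch, assuming a hypothetical bucket my-bucket and year and month partition fields (none of which come from this guide), of such a layout and of a PySpark read on an S3-enabled Spark environment that discovers the partition columns from the directory names:

    s3://my-bucket/sales/year=2023/month=01/part-0000.parquet
    s3://my-bucket/sales/year=2023/month=02/part-0000.parquet
    s3://my-bucket/sales/year=2024/month=01/part-0000.parquet

    # A minimal PySpark sketch; bucket, path, and fields are hypothetical.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("read-partitioned-s3").getOrCreate()

    # Spark infers year and month as columns from the year=.../month=... directories.
    df = spark.read.parquet("s3a://my-bucket/sales/")
    df.printSchema()                               # schema includes year and month
    df.filter("year = 2023 AND month = 1").show()  # reads only the matching directory

Filtering on a partition column lets the engine skip the directories that do not match, which is the main benefit of this layout.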

Importing a data object with partition files

Perform the following steps to import a data object that reads from or writes to partition files:
  1. Select a project or folder in the Object Explorer view.
  2. Click File > New > Data Object.
  3. Select AmazonS3 Data Object and click Next.
    The AmazonS3 Data Object dialog box appears.
  4. Click Browse next to the Location option and select the target project or folder.
  5. In the Resource Format list, select Avro, Flat, ORC, Parquet, Intelligent Structure Model, or Sample File.
  6. To add a resource to the data object, click Add next to the Selected Resource option.
    The Add Resource dialog box appears. You can use the File Type column to distinguish between a directory and a file.
  7. The following image shows the Add Resource dialog box, where you can select a file name or a directory.
  8. Select the check box for a directory and click OK.
  9. When you select CSV as the Resource Format, you can configure the following format properties and preview the flat file object. For how the delimiter, text qualifier, and escape character interact, see the sketch after these steps.
    • Delimiters. Character used to separate columns of data. If you enter a delimiter that is the same as the escape character or the text qualifier, you might receive unexpected results. The Amazon S3 reader and writer support delimiters. You cannot specify a multibyte character as a delimiter.
    • Text Qualifier. Quote character that defines the boundaries of text strings. If you select a quote character, the Developer tool ignores delimiters within pairs of quotes. The Amazon S3 reader supports text qualifiers.
    • Import Column Names From First Line. If selected, the Developer tool uses data in the first row for column names. Select this option if column names appear in the first row. The Developer tool prefixes "FIELD_" to field names that are not valid. The Amazon S3 reader and writer support this property.
    • Row Delimiter. Line break character. Select a character from the list or enter one. Preface an octal code with a backslash (\). To use a single character, enter the character. The Data Integration Service uses only the first character when the entry is not preceded by a backslash. The character must be a single-byte character, and no other character in the code page can contain that byte. Default is line feed, \012 LF (\n).
    • Escape Character. Character immediately preceding a column delimiter character embedded in an unquoted string, or immediately preceding the quote character in a quoted string. When you specify an escape character, the Data Integration Service reads the delimiter character as a regular character.
    The Start import at line, Treat consecutive delimiters as one, and Retain escape character in data properties in the Column Projection dialog box are not applicable for PowerExchange for Amazon S3.
    Click Next to preview the flat file data object.
  10. Click Finish.
    The partitioned columns are displayed with the order of partitioning on the data object Overview tab.
    The following image shows the Overview tab with the partition order.
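
The following is a minimal sketch of how the Delimiters, Text Qualifier, and Escape Character properties interact, using Python's csv module as a stand-in for the flat-file parser. The sample rows and characters are illustrative only and are not part of PowerExchange.

    import csv
    import io

    # "Doe, Jane" is wrapped in the text qualifier ("), so the embedded comma is
    # not treated as a column delimiter. In John\,Smith the escape character (\)
    # makes the delimiter read as a regular character in an unquoted string.
    raw = 'id,name,city\n1,"Doe, Jane",Boston\n2,John\\,Smith,Austin\n'
    reader = csv.reader(io.StringIO(raw), delimiter=',', quotechar='"', escapechar='\\')
    for row in reader:
        print(row)
    # ['id', 'name', 'city']
    # ['1', 'Doe, Jane', 'Boston']
    # ['2', 'John,Smith', 'Austin']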

Creating a target with partition files

Perform the following steps to create a target with partition files:
  1. Select a project or folder in the Object Explorer view.
  2. Select a source or a transformation in the mapping.
  3. Right-click the Source transformation and select Create Target.
    The Create Target dialog box appears.
    The following image shows the Create Target option.
  4. Select Others and then select the AmazonS3 data object from the list in the Data Object Type section.
  5. Click OK.
    The New AmazonS3 Data Object dialog box appears.
    The following image shows the New AmazonS3 Data Object dialog box.
  6. Enter a name for the data object.
  7. Enter the partition fields.
    The following image shows the Edit partition fields dialog box, where you can edit the partition fields.
  8. You can change the partition order using the up and down arrows. The partition order determines how the output directories are nested; see the sketch after these steps.
    The following image shows the partitioned fields after you change the order.
  9. Click Finish.
    The partitioned columns are displayed with the order of partitioning on the data object Overview tab.
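
The partition fields and their order map directly to how the output directories are nested. The following is a minimal PySpark sketch of an equivalent write, assuming a hypothetical bucket my-bucket and country and year partition fields; the mapping itself requires no code, and the Spark engine underneath behaves in a conceptually similar way.

    # A minimal PySpark sketch; bucket, path, and fields are hypothetical.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("write-partitioned-s3").getOrCreate()
    df = spark.createDataFrame(
        [(1, "US", 2023), (2, "DE", 2023), (3, "US", 2024)],
        ["id", "country", "year"],
    )

    # The partitionBy order mirrors the partition order chosen in the dialog box:
    # country first, then year -> .../country=US/year=2023/part-*.parquet
    df.write.mode("overwrite").partitionBy("country", "year").parquet("s3a://my-bucket/target/")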
