Directory Source in Amazon S3 Sources

You can select the type of source from which you want to read data. Select one of the following source types from the Source Type option under the advanced properties for an Amazon S3 data object read operation:
  • File
  • Directory
This option is applicable when you run a mapping in the native environment or on the Spark engine.
To select Directory as the source type at run time, you must select a source file when you create the data object. PowerExchange for Amazon S3 provides the option to override the values of the Folder Path and File Name properties at run time. When you set the Source Type option to Directory, the value of the File Name property is not honored.
For a read operation, if you provide the Folder Path value at run time, the Data Integration Service uses the Folder Path value from the data object read operation properties. If you do not provide the Folder Path value at run time, the Data Integration Service uses the Folder Path value that you specified when you created the data object.
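The precedence described above can be sketched in a few lines of Python. This is only an illustration of the rule, not Informatica code: the function name resolve_read_location and its parameters are hypothetical, and the sketch assumes that a run-time value, when present, simply replaces the value given during data object creation.

```python
from typing import Optional


def resolve_read_location(
    source_type: str,
    design_folder_path: str,
    design_file_name: str,
    runtime_folder_path: Optional[str] = None,
    runtime_file_name: Optional[str] = None,
) -> str:
    """Return the location a read would target (hypothetical helper)."""
    # A run-time Folder Path, when provided, takes precedence over the
    # value specified during data object creation.
    folder = (runtime_folder_path or design_folder_path).rstrip("/")

    if source_type == "Directory":
        # File Name is not honored for the Directory source type.
        return folder + "/"

    # For the File source type, a run-time File Name (if any) overrides
    # the design-time one.
    file_name = runtime_file_name or design_file_name
    return f"{folder}/{file_name}"
```

For example, with the Directory source type, a run-time Folder Path of bucket/override resolves to bucket/override/ regardless of any File Name value supplied.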
Use the following rules and guidelines to select Directory as the source type:
  • All the source files in the directory must contain the same metadata.
  • All the files must have data in the same format. For example, the delimiters, header fields, and escape characters must be the same.
  • All the files under a specified directory are parsed. The files under subdirectories are not parsed.
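To preview which objects a Directory read would pick up, the non-recursive rule in the last guideline can be approximated as follows. This is a rough sketch under stated assumptions: files_directly_under is a hypothetical helper, the object keys are assumed to have been listed already, and the "/" character is assumed to be the path separator in the keys.

```python
from typing import Iterable, List


def files_directly_under(keys: Iterable[str], folder_path: str) -> List[str]:
    """Keep only the keys that sit directly in folder_path (hypothetical helper)."""
    prefix = folder_path.rstrip("/") + "/"
    selected = []
    for key in keys:
        if not key.startswith(prefix):
            continue  # The key belongs to a different folder.
        remainder = key[len(prefix):]
        # Skip the folder placeholder itself and anything in a subdirectory.
        if remainder and "/" not in remainder:
            selected.append(key)
    return selected
```

Given the keys sales/feb.csv and sales/2020/jan.csv and the folder path sales, only sales/feb.csv is kept, because the second key lies in a subdirectory.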
When you run a mapping on the Spark engine to read multiple files, and the Amazon S3 data object is defined with the file with header option, the mapping runs successfully. However, the Data Integration Service does not generate a validation error for files that have no header.
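Because files without a header are not reported, it can help to check the files yourself before you run the mapping. The following is a minimal sketch of such a pre-check, not a feature of the product: files_with_unexpected_header is a hypothetical helper, the file contents are assumed to be already downloaded as text, and the expected header fields and delimiter are values you supply.

```python
import csv
import io
from typing import Dict, List


def files_with_unexpected_header(
    contents: Dict[str, str],
    expected_header: List[str],
    delimiter: str = ",",
) -> List[str]:
    """Return the names of files whose first row is not the expected header."""
    mismatched = []
    for name, text in contents.items():
        reader = csv.reader(io.StringIO(text), delimiter=delimiter)
        # An empty file yields no first row, which also counts as a mismatch.
        first_row = next(reader, None)
        if first_row != expected_header:
            mismatched.append(name)
    return sorted(mismatched)
```

Any file name that the function returns either lacks a header row or uses different header fields or a different delimiter than the other files in the directory.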


Updated July 30, 2020