Amazon S3 V2 Connector

Amazon S3 V2 sources in mappings

In a mapping, you can configure a Source transformation to represent an Amazon S3 V2 object as the source to read data from Amazon S3.
The following table describes the Amazon S3 V2 source properties that you can configure in a Source transformation:
Property
Description
Connection Name
Name of the Amazon S3 V2 source connection. Select a source connection or click New Parameter to define a new parameter for the source connection.
If you want to override the parameter at run time, select the Allow parameter to be overridden at run time option when you create the parameter. When the task runs, the agent uses the parameters from the file that you specify in the task advanced session properties.
Source Type
Source type. Select one of the following types:
  • Single Object
  • Parameter. Select Parameter to define the source type when you configure the mapping task.
Object
Name of the source object.
To read from multiple files, you can select a .manifest file as the object.
Parameter
Select an existing parameter for the source object or click New Parameter to define a new parameter for the source object. The Parameter property appears only if you select Parameter as the source type.
If you want to override the parameter at run time, select the Allow parameter to be overridden at run time option when you create the parameter. When the task runs, the agent uses the parameters from the file that you specify in the task advanced session properties.
Format
Specifies the file format that the Amazon S3 V2 Connector uses to read data from Amazon S3.
You can select the following file format types:
  • None¹
  • Flat
  • Avro
  • ORC
  • Parquet
  • JSON²
  • Delta¹
  • Discover Structure²
Default is None. If you select None as the format type, the Secure Agent reads data from Amazon S3 files in binary format.
You cannot use parameterized sources when you select the Discover Structure format.
Open the Formatting Options dialog box to define the format of the file.
For more information, see File formatting options.
Intelligent Structure Model²
Applies to the Discover Structure format type. Determines the underlying patterns in a sample file and auto-generates a model for files with the same data and structure.
Select one of the following options to associate a model with the transformation:
  • Select. Select an existing model.
  • New. Create a new model. Select Design New to create the model. Select Auto-generate from sample file for Intelligent Structure Discovery to generate a model based on sample input that you select.
Select one of the following options to validate the XML source object against an XML-based hierarchical schema:
  • Source object doesn't require validation.
  • Source object requires validation against a hierarchical schema. Select to validate the XML source object against an existing or a new hierarchical schema.
When you create a mapping task, you configure how Data Integration handles a schema mismatch on the Runtime Options tab. You can choose to skip the mismatched files and continue to run the task, or to stop the task when it encounters the first file that does not match.
For more information, see Components.
¹ Doesn't apply to mappings in advanced mode.
² Applies only to mappings in advanced mode.
The following table describes the advanced source properties:
Property
Description
Source Type
Type of the source from which you want to read data.
You can select the following source types:
  • File
  • Directory
Default is File.
The Directory source type doesn't apply to Delta files.
For more information, see Source types in Amazon S3 V2 sources.
Folder Path
Overrides the bucket name or folder path of the Amazon S3 source file.
If applicable, include the folder name that contains the source file in the <bucket_name>/<folder_name> format.
If you do not provide the bucket name and specify the folder path starting with a slash (/) in the /<folder_name> format, the folder path is appended to the folder path that you specified in the connection properties.
For example, if you specify the /<dir2> folder path in this property and the <my_bucket1>/<dir1> folder path in the connection property, the resulting folder path is <my_bucket1>/<dir1>/<dir2>.
If you specify the <my_bucket1>/<dir1> folder path in the connection property and the <my_bucket2>/<dir2> folder path in this property, the Secure Agent reads the file from the <my_bucket2>/<dir2> folder path that you specify in this property.
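The two Folder Path cases can be sketched in a few lines of Python. This is only an illustration of the rule as described in this property, not the connector's implementation; the function name resolve_folder_path and the handling of an empty override are assumptions.

```python
def resolve_folder_path(connection_path: str, override_path: str) -> str:
    """Return the effective folder path for a source.

    Assumed rule, taken from the Folder Path description:
    - an override that starts with "/" is appended to the connection path
    - any other non-empty override replaces it, bucket name included
    - an empty override leaves the connection path unchanged (assumption)
    """
    if not override_path:
        return connection_path
    if override_path.startswith("/"):
        suffix = override_path.strip("/")
        if not suffix:
            return connection_path
        return connection_path.rstrip("/") + "/" + suffix
    return override_path.rstrip("/")


# Override without a bucket name: appended to the connection folder path.
print(resolve_folder_path("my_bucket1/dir1", "/dir2"))

# Override with a bucket name: replaces the connection folder path.
print(resolve_folder_path("my_bucket1/dir1", "my_bucket2/dir2"))
```

The first call corresponds to the <my_bucket1>/<dir1>/<dir2> example and the second to the <my_bucket2>/<dir2> example.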
File Name
Overrides the Amazon S3 source file name.
Incremental File Load²
Indicates whether you want to incrementally load files when you use a directory as the source for a mapping in advanced mode. When you incrementally load files, the mapping task reads and processes only the files in the directory that have changed since the mapping task last ran.
For more information, see Incrementally loading files.
Allow Wildcard Characters²
Indicates whether you want to use wildcard characters for the directory source type.
If you select this option, you can use the question mark (?) and asterisk (*) wildcard characters in the folder path or file name.
For more information, see Wildcard characters.
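As a rough illustration of how the two wildcard characters typically behave, the following Python sketch uses the standard fnmatch module, in which ? matches exactly one character and * matches any run of characters. It assumes the connector follows the same conventions; the file names are hypothetical, and the connector's exact matching rules are described in the Wildcard characters topic.

```python
from fnmatch import fnmatchcase

# Hypothetical file names in a source directory.
file_names = [
    "sales_2023.csv",
    "sales_2024.csv",
    "sales_24.csv",
    "returns_2024.csv",
]


def match(pattern: str, names: list[str]) -> list[str]:
    """Return the names that match the pattern, case-sensitively."""
    return [name for name in names if fnmatchcase(name, pattern)]


# "?" stands for exactly one character, so "sales_24.csv" is excluded.
print(match("sales_????.csv", file_names))

# "*" stands for any run of characters, including none.
print(match("*_2024.csv", file_names))
```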
Recursive Directory Read²
Indicates whether you want to read flat, Avro, JSON, ORC, or Parquet files recursively from the specified folder and its subfolders. Applies when you select the directory source type.
For more information, see Recursively read files from directories.
Encryption Type
Method that you want to use to decrypt data.
You can select one of the following encryption types:
  • None
  • Informatica encryption
Default is None.
You cannot select the client-side encryption, server-side encryption, or server-side encryption with KMS encryption types.
Staging Directory¹
Path of the local staging directory.
Ensure that the user has write permissions on the directory and that the directory has sufficient space to stage the entire file. The default staging directory is the /temp directory on the machine that hosts the Secure Agent.
When you specify the directory path, the Secure Agent creates folders, depending on the number of partitions that you specify, in the following format:
InfaS3Staging<00/11><timestamp>_<partition number>
where 00 represents a read operation and 11 represents a write operation.
For example, InfaS3Staging000703115851268912800_0.
The temporary files are created within the new directory.
The staging directory source property does not apply to Avro, ORC, Parquet, and Delta files.
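When cleaning up or auditing a staging directory, it can help to split a staging folder name into its parts. The following Python sketch parses the documented name format. The helper name parse_staging_folder is hypothetical, and because the timestamp encoding is not described here, the sketch keeps the timestamp as an opaque string of digits.

```python
import re
from typing import NamedTuple


class StagingFolder(NamedTuple):
    operation: str   # "read" for code 00, "write" for code 11
    timestamp: str   # kept as text: its encoding is not documented here
    partition: int


# InfaS3Staging<00/11><timestamp>_<partition number>
_STAGING_NAME = re.compile(r"^InfaS3Staging(00|11)(\d+)_(\d+)$")


def parse_staging_folder(name: str) -> StagingFolder:
    """Split a staging folder name into operation, timestamp, and partition."""
    found = _STAGING_NAME.match(name)
    if found is None:
        raise ValueError(f"not a staging folder name: {name!r}")
    code, timestamp, partition = found.groups()
    operation = "read" if code == "00" else "write"
    return StagingFolder(operation, timestamp, int(partition))


print(parse_staging_folder("InfaS3Staging000703115851268912800_0"))
```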
Hadoop Performance Tuning Options
This property is not applicable for Amazon S3 V2 Connector.
Compression Format
Decompresses data when you read data from Amazon S3.
You can choose to decompress data in the following formats:
  • None
  • Bzip2²
  • Gzip
  • Lzo
Default is None.
In a mapping in advanced mode, you can decompress Bzip2 data when the mapping reads from a JSON file.
Amazon S3 V2 Connector does not support the Lzo compression format even though the option appears in this property.
Download Part Size¹
Size, in bytes, of each part that the Secure Agent downloads from an Amazon S3 object.
Default is 5 MB.
This property applies only when you run a mapping to read flat files.
Multiple Download Threshold¹
Minimum threshold size to download an Amazon S3 object in multiple parts.
To download the object in multiple parts in parallel, ensure that the file size of the Amazon S3 object is greater than the value that you specify in this property. Default is 10 MB.
This property applies only to flat files.
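The interaction between Download Part Size and Multiple Download Threshold can be illustrated with a small calculation. The Python sketch below is an assumption-laden illustration, not the connector's logic: it treats 1 MB as 1,048,576 bytes, assumes that an object is split into equal parts of the configured part size with a smaller final part, and uses the hypothetical function name count_download_parts.

```python
MB = 1024 * 1024  # assumption: 1 MB = 1,048,576 bytes


def count_download_parts(
    object_size: int,
    part_size: int = 5 * MB,    # Download Part Size default
    threshold: int = 10 * MB,   # Multiple Download Threshold default
) -> int:
    """Return how many parts an object of the given size would be read in.

    An object is downloaded in multiple parts only when its size is
    strictly greater than the threshold; otherwise it is read as one part.
    """
    if object_size <= threshold:
        return 1
    # Ceiling division without floating point.
    return (object_size + part_size - 1) // part_size


print(count_download_parts(8 * MB))    # below the threshold
print(count_download_parts(10 * MB))   # equal to the threshold, not greater
print(count_download_parts(12 * MB))   # above the threshold
```

With the default values, only the 12 MB object exceeds the threshold, so only that object is split into parts.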
Temporary Credential Duration
The time duration during which an IAM user can use the dynamically generated temporary credentials to access the AWS resource. Enter the time duration in seconds.
Default is 900 seconds.
If you require more than 900 seconds, set the maximum duration, up to 12 hours, in the AWS console and then enter the same duration in this property.
Tracing Level
This property is not applicable for Amazon S3 V2 Connector.
¹ Doesn't apply to mappings in advanced mode.
² Applies only to mappings in advanced mode.
