Amazon S3 V2 Connector

Incrementally loading files

You can incrementally load source files in a directory to read and process only the files that have changed since the last time the mapping task ran.
You can incrementally load files only from mappings in advanced mode. Ensure that all of the source files exist in the same Cloud environment.
To incrementally load source files, select Incremental File Load and Directory as the source type in the advanced read options of the Amazon S3 V2 data object.
When you incrementally load files from Amazon S3, the job loads files that changed between the last load time and 15 minutes before the job started running. For example, if you run a job at 2:00 p.m., the job loads files changed before 1:45 p.m. The 15-minute buffer ensures that the job loads only complete files, because uploading objects to Amazon S3 can take a few minutes to complete.
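The time window described above can be illustrated with a short Python sketch. It only models the arithmetic and is not part of the connector. The function names are hypothetical, and treating the lower bound as exclusive and the upper bound as inclusive is an assumption, because the text does not say how the boundaries are handled.

```python
from datetime import datetime, timedelta

# The 15-minute buffer described in the text.
BUFFER = timedelta(minutes=15)


def load_window(last_load_time, job_start_time):
    """Return the (lower, upper) bounds for file modification times."""
    return last_load_time, job_start_time - BUFFER


def is_picked_up(modified, last_load_time, job_start_time):
    """Return True if a file modified at `modified` falls in the window."""
    lower, upper = load_window(last_load_time, job_start_time)
    # Assumption: lower bound exclusive, upper bound inclusive.
    return lower < modified <= upper


# Example from the text: a job that starts at 2:00 p.m.
job_start = datetime(2024, 1, 1, 14, 0)
last_load = datetime(2024, 1, 1, 12, 0)  # illustrative last load time

print(load_window(last_load, job_start)[1])  # 2024-01-01 13:45:00
print(is_picked_up(datetime(2024, 1, 1, 13, 50), last_load, job_start))  # False
```

In this model, a file modified at 1:50 p.m. is skipped by the 2:00 p.m. job. It then falls inside the window of a later run, because it is newer than the upper bound of this run.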
When you configure a mapping task, the Incremental File Load section lists the Source transformations that incrementally load files and the time that the last job completed loading the files. By default, the next job that runs checks for files modified after the last load time.
The image shows the details of incremental file load
You can also override the load time that the mapping uses to look for changed files in the specified source directory. You can reset the incremental file load settings to perform a full load of all the changed files in the directory, or you can configure a time that the mapping uses to look for changed files.
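One way to picture these options is as a choice of the starting point from which the job looks for changed files. The Python sketch below is a conceptual model only, not the product's API. The function and parameter names are hypothetical, and giving the reset option precedence over a configured time is an assumption.

```python
from datetime import datetime


def window_start(last_load_time, override_time=None, reset=False):
    """Return the time after which files count as changed.

    None means there is no lower bound, which models a full load.
    """
    if reset:
        # Reset: ignore the stored load time and load everything.
        return None
    if override_time is not None:
        # A configured time replaces the stored last load time.
        return override_time
    # Default: continue from the last load time.
    return last_load_time


last_load = datetime(2024, 1, 1, 12, 0)  # illustrative value

print(window_start(last_load))                                         # 2024-01-01 12:00:00
print(window_start(last_load, override_time=datetime(2023, 12, 1)))    # 2023-12-01 00:00:00
print(window_start(last_load, reset=True))                             # None
```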
A mapping in advanced mode that incrementally loads a directory containing complex file formats, such as Parquet and Avro, fails if there are no new or changed files in the source since the last run.
For more information on incremental loading, see Reprocessing incrementally-loaded source files in Tasks in the Data Integration documentation.
