Google Cloud Storage V2 Connector

Incrementally loading files for mappings in advanced mode

You can incrementally load source files in a directory to read and process only the files that have changed since the last time the mapping task ran.
You can incrementally load files only from mappings in advanced mode. Ensure that all of the source files exist in the same Cloud environment.
To incrementally load source files from a Google Cloud Storage directory, select the Is Directory option and the Incremental File Load option in the advanced source properties of the Google Cloud Storage V2 source object.
When you incrementally load files from a Google Cloud Storage directory, the job loads files that have changed between the last load time and five minutes before the job started running. For example, if you run a job at 2:00 p.m., the job loads files that changed after the last load time and before 1:55 p.m. The five-minute buffer ensures that the job loads only complete files, because uploading objects to Google Cloud Storage can take a few minutes to complete.
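The load window described above can be sketched in a few lines of Python. This is an illustration of the arithmetic only, not the connector's implementation: the function names `load_window` and `files_to_load` are hypothetical, the file listing is made-up sample data, and whether the window boundaries are inclusive or exclusive is an assumption that the documentation does not state.

```python
from datetime import datetime, timedelta, timezone

# Five-minute buffer described in the text.
BUFFER = timedelta(minutes=5)


def load_window(last_load_time, job_start):
    """Return the (lower, upper) bounds of the incremental load window."""
    return last_load_time, job_start - BUFFER


def files_to_load(files, last_load_time, job_start):
    """Pick the files whose modification time falls inside the window.

    `files` maps a file name to its last-modified time.
    Assumption: the lower bound is exclusive and the upper bound is inclusive.
    """
    lower, upper = load_window(last_load_time, job_start)
    return [name for name, modified in files.items() if lower < modified <= upper]


# Hypothetical sample data: a job that starts at 2:00 p.m. UTC.
job_start = datetime(2024, 1, 1, 14, 0, tzinfo=timezone.utc)
last_load = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
files = {
    "a.csv": datetime(2024, 1, 1, 11, 30, tzinfo=timezone.utc),  # before the last load time
    "b.csv": datetime(2024, 1, 1, 13, 0, tzinfo=timezone.utc),   # inside the window
    "c.csv": datetime(2024, 1, 1, 13, 58, tzinfo=timezone.utc),  # inside the five-minute buffer
}

print(files_to_load(files, last_load, job_start))  # ['b.csv']
```

In this sample, only `b.csv` is selected: `a.csv` has not changed since the last load, and `c.csv` changed after 1:55 p.m., so it falls inside the buffer for this run.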
When you configure a mapping task, the Incremental File Load section lists the Source transformations that incrementally load files and the time that the last job completed loading the files. By default, the next job that runs checks for files modified after the last load time.
The following image shows the Incremental File Load section in the Persisted Task Settings page of the mapping task:
The image shows the details of incremental file load
You can also override the load time that the mapping uses to look for changed files in the specified source directory. You can reset the incremental file load settings to perform a full load of all the changed files in the directory, or you can configure a time that the mapping uses to look for changed files.
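The choice between the default load time, an overridden time, and a reset can be modeled as a small decision function. This is a hedged sketch of the behavior the paragraph describes, not the product's logic: `effective_load_time`, its parameters, and the use of `None` to represent "no lower bound" after a reset are all assumptions made for illustration.

```python
from datetime import datetime, timezone
from typing import Optional


def effective_load_time(
    last_load_time: Optional[datetime],
    override: Optional[datetime] = None,
    reset: bool = False,
) -> Optional[datetime]:
    """Return the time after which files count as changed.

    Assumption: None means there is no lower bound, so every file
    in the directory is picked up (a full load).
    """
    if reset:
        # Reset: forget the stored load time and load everything.
        return None
    if override is not None:
        # Configured time: use it instead of the stored load time.
        return override
    # Default: continue from the time the last job finished loading.
    return last_load_time


# Hypothetical sample values.
stored = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
custom = datetime(2023, 12, 25, 0, 0, tzinfo=timezone.utc)

print(effective_load_time(stored))                   # default: stored time
print(effective_load_time(stored, override=custom))  # configured time wins
print(effective_load_time(stored, reset=True))       # None: full load
```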
When a mapping in advanced mode incrementally loads a directory that contains complex file formats such as Parquet and Avro, the mapping fails if there are no new or changed files in the source since the last run.
For more information on incremental loading, see Reprocessing incrementally-loaded source files in Tasks in the Data Integration documentation.
