You can incrementally load source files in a directory to read and process only the files that have changed since the last time the mapping task ran.
You can incrementally load files only from mappings in advanced mode, including mappings
that use the discover structure and document file formats. Ensure that all of the source
files exist in the same Cloud environment.
To incrementally load source files, select Incremental File Load and Directory as the source type in the advanced read options of the Microsoft Azure Data Lake Storage Gen2 data object.
When you incrementally load files from Microsoft Azure Data Lake Storage Gen2, the job loads files that changed between the last load time and five minutes before the job started running. For example, if you run a job at 2:00 p.m., the job loads files changed before 1:55 p.m. The five-minute buffer ensures that the job loads only complete files, because uploading objects to Microsoft Azure Data Lake Storage Gen2 can take a few minutes to complete.
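The load window can be illustrated with a short Python sketch. It is not how the service is implemented. The function name `files_in_load_window`, the sample file names, and the choice to treat the lower bound as exclusive and the upper bound as inclusive are assumptions made for illustration only.

```python
from datetime import datetime, timedelta

# The five-minute buffer described above, expressed as a constant.
BUFFER = timedelta(minutes=5)


def files_in_load_window(files, last_load_time, job_start_time):
    """Return the names of files modified inside the load window.

    `files` maps a file name to its last-modified datetime.
    Assumption: a file qualifies when
    last_load_time < modified <= job_start_time - BUFFER.
    """
    cutoff = job_start_time - BUFFER
    return sorted(
        name
        for name, modified in files.items()
        if last_load_time < modified <= cutoff
    )


# Hypothetical example: the job starts at 2:00 p.m., the last load was at noon.
job_start = datetime(2024, 1, 1, 14, 0)
last_load = datetime(2024, 1, 1, 12, 0)
files = {
    "a.json": datetime(2024, 1, 1, 11, 30),  # changed before the last load: skipped
    "b.json": datetime(2024, 1, 1, 13, 0),   # changed inside the window: loaded
    "c.json": datetime(2024, 1, 1, 13, 58),  # changed inside the buffer: skipped by this job
}
selected = files_in_load_window(files, last_load, job_start)
print(selected)
```

In this sketch, only `b.json` falls between the last load time and the 1:55 p.m. cutoff.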
When you configure a mapping task, the Incremental File Load section lists the Source transformations that will incrementally load files and the time that the last job completed loading the files. By default, the next job that runs checks for files modified after the last load time.
You can also override the load time that the mapping uses to look for changed files in the specified source directory. You can reset the incremental file load settings to perform a full load of all the changed files in the directory, or you can configure a time that the mapping uses to look for changed files.
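The three options (default, reset, and a configured time) can be summarized in a small Python sketch. The function `effective_load_time` and its parameters are hypothetical and do not correspond to any product API. The sketch also assumes that a reset means the job applies no lower time bound.

```python
from datetime import datetime
from typing import Optional


def effective_load_time(
    last_load_time: datetime,
    override: Optional[datetime] = None,
    reset: bool = False,
) -> Optional[datetime]:
    """Pick the time after which a job looks for changed files.

    Assumption: returning None stands for "no lower bound", that is,
    a full load after the settings are reset.
    """
    if reset:
        return None
    if override is not None:
        return override
    return last_load_time  # default: files modified after the last load


# Hypothetical values for illustration.
last_load = datetime(2024, 1, 1, 12, 0)
custom = datetime(2023, 12, 25, 0, 0)

default_time = effective_load_time(last_load)
custom_time = effective_load_time(last_load, override=custom)
reset_time = effective_load_time(last_load, reset=True)
```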
A mapping in advanced mode that incrementally loads a directory that contains files in a
complex file format, such as JSON, fails if there are no new or changed files in the source
since the last run.
For more information on incremental loading, see Reprocessing incrementally-loaded source files in