Components

File event reliability

When you use a file listener as a source in a file ingestion task, the file listener creates a file event when a new file arrives in the folder that it monitors and when a file in that folder is updated or deleted. The file listener then notifies the file ingestion task of the new, updated, or deleted files.
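Conceptually, a file event pairs a file with the change that the listener observed. The following sketch is a hypothetical model of such an event; the type and field names are assumptions for illustration, not the product's actual schema.

```python
# Hypothetical model of a file event; names and fields are illustrative
# assumptions, not the product's actual schema.
from dataclasses import dataclass
from enum import Enum

class EventType(Enum):
    ARRIVE = "arrive"   # a new file appeared in the monitored folder
    UPDATE = "update"   # an existing file was modified
    DELETE = "delete"   # a file was removed

@dataclass
class FileEvent:
    path: str             # file that the event refers to
    event_type: EventType
    modified_time: float  # last-modified timestamp the listener observed
```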
The file listener handles the events based on the following conditions:
  • If multiple events occur for a file, the file listener notifies the file ingestion task with only the last event for that file.
  • If file events don't reach the file ingestion task, for example, because the Secure Agent isn't running or there is a temporary network disruption, the file listener queues the last event for each file and includes it in the notification of the next file ingestion job. A file ingestion task therefore receives a notification about each file at least once, which ensures at-least-once reliability between the file listener and the file ingestion task (see the sketch after this list).
    File events that aren't processed remain in the queue for seven days.
  • File events that are in the file listener queue reach the file ingestion task by one of the following methods:
    • When a file ingestion job completes, the mass ingestion service makes a pull request to the file listener to check for queued events. If the service finds any, it triggers a new ingestion job to process them. The pull request doesn't trigger the processing of files that are already assigned to another concurrent job run by the same ingestion task, so only one ingestion job processes a file at any time.
    • If the pull request doesn't pick up some events, for example, because the Secure Agent isn't running when the mass ingestion service makes the request, the file listener queues the last event for each file and includes it in the notification of the next file ingestion job.
  • The file ingestion task doesn't automatically reprocess file events that are in success, failed, or duplicate status. If an event in one of these statuses reaches the file ingestion task, the file ingestion job reports the event to the file listener as processed, even if the job encountered errors while processing the file or writing it to the target. Therefore, at-least-once reliability exists only between the file listener and the file ingestion task, not between the file listener and the target.
    You need to manually identify files that weren't successfully transferred to the target because of an error, for example, by using the transfer logs. To resolve the problem, move or modify the files manually so that the file listener picks them up again. For example, if the last modified time of a file changes, the file listener identifies the file as updated even if its contents haven't changed.
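The following sketch, building on the hypothetical FileEvent model above, is one way to picture the queueing behavior described in this list: the listener coalesces to the last event per file, expires unprocessed events after seven days, and re-delivers queued events on each pull until the task acknowledges them. The ListenerQueue class and its method names are assumptions for illustration, not the product's implementation.

```python
# Hypothetical sketch of the listener queue semantics; not the actual product code.
import time

SEVEN_DAYS = 7 * 24 * 60 * 60  # retention window for unprocessed events

class ListenerQueue:
    def __init__(self):
        self._events = {}  # path -> (FileEvent, enqueue_time)

    def record(self, event):
        # Coalesce: a newer event for the same file replaces the older one,
        # so the task is notified with only the last event per file.
        self._events[event.path] = (event, time.time())

    def pull(self):
        # Called when a file ingestion job completes (the pull request).
        # Drop events older than seven days; return the rest for delivery.
        now = time.time()
        self._events = {p: (e, t) for p, (e, t) in self._events.items()
                        if now - t < SEVEN_DAYS}
        return [e for e, _ in self._events.values()]

    def acknowledge(self, path):
        # The job reports the event as processed even if the file transfer
        # itself failed, so reliability is at-least-once only up to the task.
        self._events.pop(path, None)
```

Because acknowledge removes an event regardless of the transfer outcome, a failed write to the target isn't retried automatically; that gap is what the manual recovery steps above address.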
