Secure Agent Best Practices and Tuning Guidelines

Tuning Amazon S3 data sources

If you use tasks that read data from Amazon S3 data sources, you can tune some Amazon S3 parameters for better performance. Configure the staging folder for Amazon S3 data files and the multipart download threshold size parameters.

Staging folder for Amazon S3 data files

The reader for Amazon S3 downloads files in chunks by staging the files in a temporary staging folder. By default, the folder for staging data is /tmp. Ensure that adequate disk storage is available for the /tmp folder.
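As a rough illustration, a free-space check on the staging folder could look like the following sketch. The function name has_free_space and the idea of comparing against a required byte count are assumptions made for this example; they are not part of the Secure Agent or the Amazon S3 connector.

```python
import shutil


def has_free_space(path: str, required_bytes: int) -> bool:
    """Return True if the filesystem that holds `path` has at least
    `required_bytes` of free space.

    Hypothetical helper for illustration only.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage.free >= required_bytes
```

For example, has_free_space("/tmp", 20 * 1024**3) reports whether /tmp has at least 20 GiB free. The 20 GiB figure is arbitrary; size it to the largest files that you expect to stage at the same time.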
You can also override the default staging folder in the source object in the mapping. If the Secure Agent machine has multiple disks, it is a good practice to distribute the staging folders across those disks. This reduces the disk bottleneck for Amazon S3 staging, which makes staging faster and improves performance.
For example, if your machine has four disks mounted at /disk1, /disk2, /disk3, and /disk4, you can configure a different one of these disks as the staging folder for each Amazon S3 data source.
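One way to plan such a distribution is a simple round-robin assignment of data sources to disks, sketched below. The function assign_staging_folders and the source names are hypothetical and only help you plan the layout. You still set the staging folder on each source object in the mapping yourself.

```python
from itertools import cycle


def assign_staging_folders(sources, disks):
    """Map each source name to a staging folder, cycling through the
    disks so that the load is spread as evenly as possible.

    Hypothetical planning helper; it does not configure anything.
    """
    if not disks:
        raise ValueError("at least one disk is required")
    return dict(zip(sources, cycle(disks)))


# Illustrative source names, not taken from any real mapping.
plan = assign_staging_folders(
    ["orders", "customers", "events", "invoices", "returns"],
    ["/disk1", "/disk2", "/disk3", "/disk4"],
)
```

With five sources and four disks, the fifth source wraps around to /disk1. If some sources are much larger than others, a weighted assignment may suit you better than plain round-robin.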

Multipart download threshold size

The Amazon S3 reader downloads source files in small chunks and merges them during task execution. The default values for Download Part Size (5MB) and Multipart Download Threshold (10MB) are sometimes insufficient for processing large Amazon S3 data sources. This can cause mapping execution to fail sporadically with an error such as "Unable to download the files from S3."
Set both parameters to a larger value, such as 512MB. You can override the parameter values in the advanced source properties of the Source transformation in the mapping.
The following image shows the properties:
The image shows the Download Part Size and Multipart Download Threshold properties. Download Part Size is set to 512242880 bytes. Multipart Download Threshold is set to 512485760 bytes.
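Because the properties are entered in bytes, it can help to convert sizes and estimate how many parts a download splits into. The sketch below assumes binary megabytes (1MB = 1,048,576 bytes) and assumes that only files larger than the threshold are downloaded in parts. Both are assumptions for illustration, and the helper names are hypothetical.

```python
import math

BYTES_PER_MB = 1024 * 1024  # assumption: binary megabytes


def mb_to_bytes(megabytes: int) -> int:
    """Convert a size in MB to bytes."""
    return megabytes * BYTES_PER_MB


def part_count(file_size: int, part_size: int, threshold: int) -> int:
    """Estimate the number of download parts for a file.

    Assumes files at or below `threshold` are fetched in a single
    request and larger files are split into `part_size` chunks.
    """
    if file_size <= threshold:
        return 1
    return math.ceil(file_size / part_size)


one_gib = 1024 * BYTES_PER_MB
default_parts = part_count(one_gib, mb_to_bytes(5), mb_to_bytes(10))
tuned_parts = part_count(one_gib, mb_to_bytes(512), mb_to_bytes(512))
```

Under these assumptions, a 1 GiB file splits into 205 parts with the default settings and into 2 parts when both parameters are set to 512MB, which gives a sense of how much merge work the larger values avoid.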
