Hadoop Performance Tuning Options for EMR Distribution
You can use Hadoop Performance Tuning Options to optimize performance on the Amazon EMR distribution when you copy large volumes of data between Amazon S3 buckets and HDFS.
You must provide the Hadoop Performance Tuning Options as semicolon-separated name-value attribute pairs.
Use the following parameters for Hadoop Performance Tuning Options:
mapreduce.map.java.opts
fs.s3a.fast.upload
fs.s3a.multipart.threshold
fs.s3a.multipart.size
mapreduce.map.memory.mb
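As a minimal sketch of the expected format, the following Python helper joins name-value pairs into a single semicolon-separated string. The helper name build_tuning_options is hypothetical, the use of = between a name and its value is an assumption, and the values are illustrative placeholders only, not the recommended values. The S3A property names use the standard Hadoop spelling (fs.s3a.multipart.threshold and fs.s3a.multipart.size).

```python
def build_tuning_options(options):
    """Join name-value pairs into one semicolon-separated string.

    Assumes each pair is written as name=value; the separator between
    pairs is a semicolon, so neither part may contain one.
    """
    parts = []
    for name, value in options.items():
        text = str(value)
        if ";" in name or "=" in name or ";" in text:
            raise ValueError(f"invalid name-value pair: {name!r}={text!r}")
        parts.append(f"{name}={text}")
    return ";".join(parts)


# Illustrative placeholder values only; size them for your own cluster.
example_options = {
    "mapreduce.map.java.opts": "-Xmx2048m",
    "fs.s3a.fast.upload": "true",
    "fs.s3a.multipart.threshold": 67108864,  # bytes (64 MB)
    "fs.s3a.multipart.size": 67108864,       # bytes (64 MB)
    "mapreduce.map.memory.mb": 2560,
}

if __name__ == "__main__":
    print(build_tuning_options(example_options))
```

The JVM heap in mapreduce.map.java.opts is usually kept below the container size in mapreduce.map.memory.mb so that the map task stays within its memory allocation.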
The following sample shows the recommended values for the parameters: