Hadoop Performance Tuning Options for EMR Distribution

Use Hadoop Performance Tuning Options to optimize performance on the Amazon EMR distribution when you copy large volumes of data between Amazon S3 buckets and HDFS.
Specify the options as semicolon-separated name=value pairs.
You can set the following parameters in Hadoop Performance Tuning Options:
  • mapreduce.map.java.opts
  • fs.s3a.fast.upload
  • fs.s3a.multipart.threshold
  • fs.s3a.multipart.size
  • mapreduce.map.memory.mb
The following sample shows the recommended values for these parameters. The multipart threshold and size are given in bytes (33554432 bytes is 32 MB):
mapreduce.map.java.opts=-Xmx4096m;fs.s3a.fast.upload=true;fs.s3a.multipart.threshold=33554432;fs.s3a.multipart.size=33554432;mapreduce.map.memory.mb=4096
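If you generate the option string from a script, a minimal Python sketch along the following lines can help avoid separator mistakes. The helper names build_options and parse_options and the RECOMMENDED dictionary are hypothetical and are not part of any product API. The sketch only assumes the name=value;name=value format shown in the sample above.

```python
# Hypothetical helpers for illustration only; not part of any product API.
# Assumes the format shown in the sample: name=value pairs joined by ";".

RECOMMENDED = {
    "mapreduce.map.java.opts": "-Xmx4096m",
    "fs.s3a.fast.upload": "true",
    "fs.s3a.multipart.threshold": str(32 * 1024 * 1024),  # 32 MB in bytes
    "fs.s3a.multipart.size": str(32 * 1024 * 1024),       # 32 MB in bytes
    "mapreduce.map.memory.mb": "4096",
}


def build_options(pairs):
    """Join name/value pairs into one semicolon-separated string."""
    for name, value in pairs.items():
        # A ";" inside a name or value would be read as a pair separator.
        if not name or ";" in name or "=" in name or ";" in value:
            raise ValueError(f"invalid option: {name!r}={value!r}")
    return ";".join(f"{name}={value}" for name, value in pairs.items())


def parse_options(text):
    """Split a semicolon-separated option string back into a dictionary."""
    result = {}
    for item in text.split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing semicolon
        name, sep, value = item.partition("=")
        if not sep or not name:
            raise ValueError(f"expected name=value, got {item!r}")
        result[name] = value
    return result


options = build_options(RECOMMENDED)
print(options)
```

Parsing the generated string back with parse_options and comparing it with the input dictionary is a quick way to confirm that no pair was dropped or merged.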


Updated July 30, 2020