Table of Contents

  1. Preface
  2. Introduction to Big Data Management Administration
  3. Big Data Management Engines
  4. Authentication and Authorization
  5. Running Mappings on a Cluster with Kerberos Authentication
  6. Configuring Access to an SSL/TLS-Enabled Cluster
  7. Cluster Configuration
  8. Cluster Configuration Privileges and Permissions
  9. Cloud Provisioning Configuration
  10. Queuing
  11. Tuning for Big Data Processing
  12. Connections
  13. Multiple Blaze Instances on a Cluster

Big Data Management Administrator Guide

Tuning the Spark Engine

Tune the Spark engine according to a deployment type that defines the big data processing requirements. When you tune the Spark engine, the autotune command configures the Spark advanced properties in the Hadoop connection.
The following table describes the advanced properties that are tuned:
  Property                        Description
  spark.driver.memory             The driver process memory that the Spark engine uses to run mapping jobs.
  spark.executor.memory           The amount of memory that each executor process uses to run tasks on the Spark engine.
  spark.executor.cores            The number of cores that each executor process uses to run tasks on the Spark engine.
  spark.sql.shuffle.partitions    The number of partitions that the Spark engine uses to shuffle data to process joins or aggregations in a mapping job.
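These advanced properties correspond to standard Spark configuration settings. As a sketch, a set of tuned values might appear in the Hadoop connection's Spark advanced properties as key=value pairs like the following (the values shown are illustrative, taken from the Standard deployment type below):

```
spark.driver.memory=4G
spark.executor.memory=6G
spark.executor.cores=2
spark.sql.shuffle.partitions=1500
```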
The following table lists the tuned value for each advanced property based on the deployment type:
  Property                        Sandbox    Basic    Standard    Advanced
  spark.driver.memory             1 GB       2 GB     4 GB        4 GB
  spark.executor.memory           2 GB       4 GB     6 GB        6 GB
  spark.executor.cores            2          2        2           2
  spark.sql.shuffle.partitions    100        400      1500        3000
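The mapping from deployment type to tuned values in the table above can be sketched as a simple lookup. This is a hypothetical helper for illustration only, not part of the product; the dictionary mirrors the table:

```python
# Tuned Spark advanced properties per deployment type, mirroring the table above.
# The function name and structure are illustrative assumptions, not product APIs.
TUNED_SPARK_PROPERTIES = {
    "Sandbox":  {"spark.driver.memory": "1G", "spark.executor.memory": "2G",
                 "spark.executor.cores": 2, "spark.sql.shuffle.partitions": 100},
    "Basic":    {"spark.driver.memory": "2G", "spark.executor.memory": "4G",
                 "spark.executor.cores": 2, "spark.sql.shuffle.partitions": 400},
    "Standard": {"spark.driver.memory": "4G", "spark.executor.memory": "6G",
                 "spark.executor.cores": 2, "spark.sql.shuffle.partitions": 1500},
    "Advanced": {"spark.driver.memory": "4G", "spark.executor.memory": "6G",
                 "spark.executor.cores": 2, "spark.sql.shuffle.partitions": 3000},
}

def tuned_spark_properties(deployment_type: str) -> dict:
    """Return the tuned Spark advanced properties for a deployment type."""
    try:
        return TUNED_SPARK_PROPERTIES[deployment_type]
    except KeyError:
        raise ValueError(f"Unknown deployment type: {deployment_type}")
```

For example, `tuned_spark_properties("Standard")` returns the third column of the table, including 1500 shuffle partitions.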
