Performance Tuning and Sizing Guidelines for Informatica® Big Data Management 10.2.2

Tune the Spark Engine

When you develop mappings in the Developer tool to run on the Spark engine, consider the following prerequisites, tuning recommendations, and performance best practices.
Meet the following prerequisites:
  • On the Hadoop cluster, configure the Spark History Server.
  • On the Hadoop cluster, enable the Spark Shuffle Service.
  • To run mappings on the Spark engine, configure the Hadoop connection with the location of the Spark HDFS staging directory and the Spark event log directory.
    For the event log directory, use the same directory that the Spark History Server reads from. The Spark Event Log Directory is the base directory in which Spark logs events. Within this base directory, Spark creates a subdirectory for each application and logs the events specific to that application there.
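
The prerequisites above can be sanity-checked against the cluster's Spark configuration. The following Python sketch is illustrative only and is not part of Informatica Big Data Management. It assumes the cluster settings are available as whitespace-separated `spark-defaults.conf` text and that the cluster uses the standard Apache Spark property names `spark.eventLog.enabled`, `spark.eventLog.dir`, `spark.history.fs.logDirectory`, and `spark.shuffle.service.enabled`. The helper names `parse_spark_conf` and `check_prerequisites` are hypothetical, and the sample paths are placeholders.

```python
# Hypothetical helpers for illustration; not an Informatica or Spark API.

def parse_spark_conf(text):
    """Parse whitespace-separated 'key value' lines, skipping blanks and # comments."""
    conf = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)  # split key from value on the first whitespace run
        if len(parts) == 2:
            conf[parts[0]] = parts[1].strip()
    return conf


def check_prerequisites(conf):
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []

    # Event logging must be on for the History Server to have anything to read.
    if conf.get("spark.eventLog.enabled", "false").lower() != "true":
        problems.append("spark.eventLog.enabled is not true")

    # The event log directory should match the directory the History Server reads.
    event_dir = conf.get("spark.eventLog.dir")
    history_dir = conf.get("spark.history.fs.logDirectory")
    if event_dir is None or history_dir is None:
        problems.append("event log directory or History Server log directory is not set")
    elif event_dir.rstrip("/") != history_dir.rstrip("/"):
        problems.append("spark.eventLog.dir and spark.history.fs.logDirectory differ")

    # The external shuffle service must be enabled.
    if conf.get("spark.shuffle.service.enabled", "false").lower() != "true":
        problems.append("spark.shuffle.service.enabled is not true")

    return problems


# Placeholder values for illustration only.
SAMPLE = """
spark.eventLog.enabled            true
spark.eventLog.dir                hdfs:///spark-history
spark.history.fs.logDirectory     hdfs:///spark-history/
spark.shuffle.service.enabled     true
"""

problems = check_prerequisites(parse_spark_conf(SAMPLE))
print(problems or "all prerequisite checks passed")
```

The check compares the two directory settings after dropping trailing slashes, so paths that differ only in a trailing slash count as the same. How these cluster properties map to the fields of the Hadoop connection depends on the environment and is not covered by this sketch.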
For more information about configuring Spark History Server and Spark Shuffle Service, refer to the Hadoop distribution documentation or the Apache Spark documentation.
For more information about configuring the Hadoop connection, refer to the Informatica Big Data Management User Guide.
