Performance Tuning and Sizing Guidelines for Informatica® Big Data Management 10.2.2

Joiner Transformation

You can optimize Joiner transformations to enable the Spark engine to efficiently perform a full outer join.
To increase memory for a full outer join and to determine shuffle partitions, perform the following two-step tuning process:
  1. Ensure every executor core has at least 3 GB of memory.
    For example, set spark.executor.memory=6G and spark.executor.cores=2, which gives each of the two cores 3 GB.
  2. Set spark.sql.shuffle.partitions = <number of master splits> + <number of detail splits>.
    The spark.sql.shuffle.partitions property determines the number of partitions to use when shuffling data for joins or aggregations.
    For example, with a DFS block size of 256 MB, 100 GB of master data has 400 splits and 200 GB of detail data has 800 splits, so set spark.sql.shuffle.partitions=1200.
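
The arithmetic behind both steps can be sketched in a few lines of Python. This is an illustrative calculation only, not part of the product: the function names are hypothetical, and it assumes each input splits into one partition per DFS block, so the split count is simply the data size divided by the block size, rounded up. Real split counts can differ with file layout and compression.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def split_count(data_bytes: int, block_bytes: int) -> int:
    """Number of DFS blocks needed to hold the data (ceiling division)."""
    return -(-data_bytes // block_bytes)


def shuffle_partitions(master_bytes: int, detail_bytes: int, block_bytes: int) -> int:
    """Step 2: master splits plus detail splits."""
    return split_count(master_bytes, block_bytes) + split_count(detail_bytes, block_bytes)


def memory_per_core_gb(executor_memory_gb: float, executor_cores: int) -> float:
    """Step 1: memory available to each executor core, in GB."""
    return executor_memory_gb / executor_cores


# Values taken from the examples in the steps above.
block = 256 * MIB
partitions = shuffle_partitions(100 * GIB, 200 * GIB, block)
per_core = memory_per_core_gb(6, 2)

print(f"spark.sql.shuffle.partitions={partitions}")
print(f"memory per core: {per_core} GB (guideline: at least 3 GB)")
```

With the example sizes, the script reports 1200 shuffle partitions and 3 GB per core, matching the figures in the steps.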
