Performance Tuning and Sizing Guidelines for Informatica® Big Data Management 10.2.2

Case Study: Traditional Update Strategy versus Hive MERGE

The following case study compares how long an update strategy task takes to complete, depending on the mix of INSERT and UPDATE statements in the task and on whether the task implements Hive MERGE.

Environment

Chipset: Intel® Xeon® CPU E5-2680 v3 @ 2.50GHz
Cores: 12 cores
Memory: 128 GB
Operating system: Red Hat Enterprise Linux 7.3
Hadoop cluster: 14 nodes
Data volume: ~75 GB

Performance Chart

The following performance chart compares the time that an update strategy task takes to complete, with and without Hive MERGE, for two combinations of INSERT and UPDATE statements:
[Image: performance chart for update strategy tasks with and without Hive MERGE. The graph on the left shows a task that is 25% INSERT statements and 75% UPDATE statements. The graph on the right shows a task that is 33% INSERT statements and 66% UPDATE statements.]

Conclusions

Based on the case study, the update strategy task performs 30% better when it implements Hive MERGE.
When you perform incremental updates and the percentage of UPDATE statements is higher than the percentage of INSERT statements, consider using Hive MERGE.
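
For readers unfamiliar with the statement, the following Python sketch assembles a HiveQL MERGE that updates matching rows and inserts new ones in a single pass. The helper build_merge_statement and the table and column names (customers, customers_staging, id, name, city) are hypothetical, and the generated text is only an illustration of the general MERGE syntax, not necessarily the statement that the update strategy task produces. It also assumes that the column list is given in the target table's column order and that the target is a transactional (ACID) table, which Hive requires for MERGE.

```python
def build_merge_statement(target, source, key_columns, columns):
    """Return a HiveQL MERGE statement that upserts `source` rows into `target`.

    `columns` must list every target column in table order, because
    INSERT VALUES is positional. All names passed in are placeholders.
    """
    if not key_columns:
        raise ValueError("at least one key column is required")

    # Only non-key columns are updated; keys are used for matching.
    non_key = [c for c in columns if c not in key_columns]
    if not non_key:
        raise ValueError("at least one non-key column is required to update")

    on_clause = " AND ".join(f"t.{c} = s.{c}" for c in key_columns)
    # Hive does not allow a table qualifier on the left side of SET.
    set_clause = ", ".join(f"{c} = s.{c}" for c in non_key)
    values_clause = ", ".join(f"s.{c}" for c in columns)

    return (
        f"MERGE INTO {target} AS t\n"
        f"USING {source} AS s\n"
        f"ON {on_clause}\n"
        f"WHEN MATCHED THEN UPDATE SET {set_clause}\n"
        f"WHEN NOT MATCHED THEN INSERT VALUES ({values_clause})"
    )


if __name__ == "__main__":
    # Hypothetical incremental load from a staging table.
    print(build_merge_statement(
        "customers", "customers_staging", ["id"], ["id", "name", "city"]
    ))
```

Because the WHEN MATCHED and WHEN NOT MATCHED branches are evaluated in one statement, the updates and inserts of an incremental load do not have to be issued as separate operations.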
