Tuning and Sizing Guidelines for Data Engineering Integration (10.4.x)

Case Study: Data Integration Service Concurrency with Multiple HS2 Load Balancers

The following case study shows the benefits of using multiple HiveServer2 load balancers for a large number of concurrent mappings running on a four-node Data Integration Service grid.
The mappings used TPC-DS benchmark queries of medium complexity with Hive sources and parameterized HDFS targets. During the test, peak CPU utilization was approximately 20 cores (80%) for less than 5 minutes, and average utilization was approximately 4 cores.

Environment

                      Cloudera Cluster                            Data Integration Service 4-Node Grid
Chipset               Intel® Xeon® Processor X5675 @ 3.06 GHz     Intel® Xeon® Gold 6132 CPU @ 2.60 GHz
Cores                 4 x 6 cores                                 4 x 14 cores
Memory                32 GB                                       125 GB
Operating System      Red Hat Enterprise Linux 6.1                Red Hat Enterprise Linux 7.5 (Maipo)
Hadoop Distribution   Cloudera 5.15                               -
Hadoop Cluster Size   25 nodes                                    -

HiveServer2 Load Balancer Configuration

Test staff configured the HiveServer2 load balancer using the following steps (a sample configuration appears after this list):
  • Install the HAProxy package or another load balancer recommended by your IT team.
  • Configure the HAProxy service to listen on port 10000 and include the HiveServer2 instances as back-end servers.
  • Configure the HAProxy service to start on boot.
  • In Cloudera Manager, enter the load balancer address in the HiveServer2 Load Balancer configuration property.
  • Restart the Hive service.
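
The following haproxy.cfg fragment is a minimal sketch of such a setup. The host names and the choice of source-based balancing are assumptions for illustration, not values from the test environment:

    # /etc/haproxy/haproxy.cfg (fragment) - host names are placeholders
    listen hiveserver2
        # Listen on the port that clients use to reach HiveServer2.
        bind 0.0.0.0:10000
        # HiveServer2 uses the Thrift protocol, so balance at the TCP layer.
        mode tcp
        option tcplog
        # Source-based balancing keeps a client session on the same instance.
        balance source
        # Back-end HiveServer2 instances (hypothetical hosts).
        server hs2_1 hs2-node1.example.com:10000 check
        server hs2_2 hs2-node2.example.com:10000 check

With a configuration like this, the HiveServer2 Load Balancer property in Cloudera Manager would point to the HAProxy host on port 10000.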

Performance Chart

The following performance chart compares the dispatch times for 10,000 concurrent jobs on the Hadoop cluster. Cluster dispatch time is the time that the Data Integration Service takes to submit all mappings to the cluster.

Conclusions

The test found that dispatch time improved by approximately 40% with two HiveServer2 instances.
