Tuning and Sizing Guidelines for Data Engineering Integration (10.4.x)

Case Study: Data Integration Service Concurrency

The following case study tested a large number of concurrent mappings running on a single-node Data Integration Service. The mappings used TPC-DS benchmark queries of medium complexity with Hive sources and parameterized HDFS targets. During the test, CPU utilization peaked at ~20 cores (80%), and the peak lasted less than 5 minutes. Average utilization was ~8 cores.

Environment

Chipset: Intel® Xeon® Processor X5675 @ 3.06 GHz
Cores: 4 x 6 cores
Memory: 32 GB
Operating system: Red Hat Enterprise Linux 6.1
Hadoop distribution: Cloudera 5.15
Hadoop cluster: 25 nodes
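
The CPU figures quoted in the case study can be related to the core count listed in this environment. The following is a minimal sketch, not part of the original study. It assumes that "4 x 6 cores" means 24 cores in total and that utilization is expressed relative to that total. The function name `utilization_percent` is hypothetical and used only for illustration.

```python
# Assumption: "4 x 6 cores" in the environment means 24 cores in total.
TOTAL_CORES = 4 * 6


def utilization_percent(busy_cores: float, total_cores: int = TOTAL_CORES) -> float:
    """Return the busy cores as a percentage of the total cores."""
    if total_cores <= 0:
        raise ValueError("total_cores must be positive")
    return 100.0 * busy_cores / total_cores


# Approximate figures reported in the case study.
peak = utilization_percent(20)     # ~20 cores at peak
average = utilization_percent(8)   # ~8 cores on average

print(f"peak: {peak:.0f}%, average: {average:.0f}%")
# prints "peak: 83%, average: 33%"
```

Under these assumptions, the peak works out to roughly 83%, which is in line with the approximate 80% reported, and the average corresponds to about one third of the available cores.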

Performance Chart

The following performance chart compares the dispatch times for 2K and 5K concurrent jobs. Dispatch time is the time that the Data Integration Service (DIS) takes to submit all of the mappings to the cluster:

Conclusions

