Tuning the Hive Engine for Big Data Management®

Case Study: HBase Read and Write

This case study demonstrates how to optimize HBase read and write performance.

HBase Reader Mapping

The mapping reads approximately 7.5 GB of data, 60 million relational records, from an HBase table and writes the data to an HDFS file. The HBase table originally had 8 regions. Splitting the table into 54 regions improves performance by 62%.
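
As a rough illustration of why the region count matters, the following sketch computes how much data each region holds before and after the split. It assumes the records are spread evenly across regions and that the read work is parallelized per region; neither assumption is confirmed by the case study, and the function name is illustrative.

```python
# Back-of-the-envelope figures. Even distribution across regions is an assumption.
TOTAL_RECORDS = 60_000_000   # record count from the case study
TOTAL_GB = 7.5               # approximate data volume from the case study


def per_region(regions: int):
    """Return (records, gigabytes) held by each region under an even split."""
    return TOTAL_RECORDS / regions, TOTAL_GB / regions


for regions in (8, 54):
    records, gigabytes = per_region(regions)
    print(f"{regions:>2} regions: {records:,.0f} records, {gigabytes:.3f} GB per region")
```

With more regions, each unit of read work covers a smaller slice of the table, so more slices can be processed in parallel if the cluster has the capacity.
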
The following image shows the mapping:

Result

HBase Writer Mapping

The mapping reads approximately 7.5 GB of data, 60 million relational records, from an HDFS file. The mapping uses an Expression transformation to generate the HBase row key and writes the data to an HBase table.
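
The case study does not show the expression that generates the row key. The sketch below illustrates one common pattern, a salted row key, in which a short hash-derived prefix spreads sequential IDs across the key space. The function name, the `|` separator, and the choice of 54 buckets are assumptions made for illustration and are not taken from the mapping.

```python
import hashlib

NUM_BUCKETS = 54  # assumption: one salt bucket per target region


def make_row_key(record_id: str, buckets: int = NUM_BUCKETS) -> str:
    """Prefix the record ID with a stable two-digit salt derived from its hash."""
    digest = hashlib.sha256(record_id.encode("utf-8")).digest()
    salt = int.from_bytes(digest[:4], "big") % buckets
    return f"{salt:02d}|{record_id}"


# The same ID always maps to the same salted key.
print(make_row_key("1000001"))
print(make_row_key("1000002"))
```

Because the salt is derived from the ID itself, a reader can recompute the full row key for a point lookup. The trade-off is that range scans over the original ID order must fan out across all buckets.
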
The following image shows the mapping:

Pre-splitting and Result

The HBase table was not originally pre-split. Pre-splitting the table into 54 regions improves performance by 60%.
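
Pre-splitting a table into N regions requires N - 1 split keys. The following sketch computes evenly spaced split keys for 54 regions, assuming that row keys begin with a uniformly distributed four-digit hexadecimal prefix. That key layout and the function name are assumptions; the case study does not state which split keys were used.

```python
def split_points(regions: int, prefix_hex_digits: int = 4) -> list[str]:
    """Return regions - 1 evenly spaced hex prefixes to use as region boundaries."""
    if regions < 2:
        return []
    key_space = 16 ** prefix_hex_digits
    return [
        format(key_space * i // regions, f"0{prefix_hex_digits}x")
        for i in range(1, regions)
    ]


points = split_points(54)
print(len(points), "split keys, first:", points[0], "last:", points[-1])
```

A list like this can be supplied as the split keys when the table is created, for example through the `SPLITS` option of the HBase shell `create` command. Evenly spaced keys balance the load only if the actual row keys are spread evenly over the same prefix range.
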
The following image shows the result for a pre-split table:

Disabling Auto Flush

When you use PowerExchange for HBase to write data to HBase and disable Auto Flush, performance improves by about 43%.
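
The following toy model illustrates why disabling auto flush tends to help: with auto flush enabled, each put is sent to the server on its own, whereas with auto flush disabled, puts accumulate in a client-side buffer and are sent in batches. The class is a simplified sketch and is not the HBase client API. In particular, it flushes after a fixed number of rows, while a real client buffer is typically bounded by size in bytes. The row counts are illustrative only.

```python
class BufferedWriter:
    """Toy model of a client-side write buffer; not the HBase client API."""

    def __init__(self, flush_size: int):
        self.flush_size = flush_size  # puts held before one batched send
        self.pending = []
        self.round_trips = 0

    def put(self, row):
        self.pending.append(row)
        if len(self.pending) >= self.flush_size:
            self.flush()

    def flush(self):
        if self.pending:
            self.round_trips += 1  # one simulated round trip carries the whole batch
            self.pending.clear()


def count_round_trips(rows: int, flush_size: int) -> int:
    """Write `rows` puts and return the number of simulated round trips."""
    writer = BufferedWriter(flush_size)
    for i in range(rows):
        writer.put(i)
    writer.flush()  # send whatever is left in the buffer
    return writer.round_trips


print(count_round_trips(10_000, 1))    # auto flush on: 10000 round trips
print(count_round_trips(10_000, 500))  # auto flush off (batches of 500): 20 round trips
```

The model counts only round trips. Buffered puts that have not yet been flushed can be lost if the client fails, which is the usual trade-off of disabling auto flush.
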
The following image shows the result after disabling auto flush:
