Tuning the Hive Engine for Big Data Management®

Case Study: Storage Handler Improvements

Effective in version 9.6.0, the Informatica storage handler includes optimizations that can improve performance.
When a mapping uses HDFS as the source, it runs with the Informatica storage handler. When a mapping uses a Hive table as the source, it runs with the Hive storage handler.
The following figure shows the mapping with an Aggregator transformation:
The mapping reads a 750 GB file, applies an expression to each row of data, and stores the aggregated result in a file.
To test performance, the mapping is run twice: once with the Informatica storage handler, using HDFS as the source and target, and once with the Hive storage handler, using Hive as the source and target.

Result

The mapping ran approximately 56 percent faster with the Informatica storage handler than with the Hive storage handler.
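The case study does not state how the 56 percent figure was calculated. A "percent faster" result can mean either a reduction in elapsed time or an increase in processing rate, and the two give different numbers for the same pair of runs. The following minimal sketch shows both calculations. The function names and the run times are hypothetical and are not measurements from this case study.

```python
def percent_time_reduction(baseline_s: float, candidate_s: float) -> float:
    """Percent reduction in elapsed time relative to the baseline run."""
    return (baseline_s - candidate_s) / baseline_s * 100.0


def percent_speedup(baseline_s: float, candidate_s: float) -> float:
    """Percent increase in processing rate relative to the baseline run."""
    return (baseline_s / candidate_s - 1.0) * 100.0


# Hypothetical elapsed times in seconds (assumed values for illustration only).
hive_handler_s = 1000.0
informatica_handler_s = 800.0

# Same two runs, two different "percent faster" figures.
print(round(percent_time_reduction(hive_handler_s, informatica_handler_s), 1))  # 20.0
print(round(percent_speedup(hive_handler_s, informatica_handler_s), 1))  # 25.0
```

When you compare storage handlers in your own environment, record which of the two definitions you use so that results from different runs remain comparable.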
