Strategies for Incremental Updates on Hive in Big Data Management 10.2

Approach 1. Update Strategy Transformation

You can use an Update Strategy transformation to update Hive ACID tables. In the transformation, you define expressions with the IIF or DECODE functions to set the rules for updating rows. For example, the following IIF function flags a row for rejection if the entry date is after the apply date. Otherwise, the function flags the row for update:
IIF ((ENTRY_DATE > APPLY_DATE), DD_REJECT, DD_UPDATE)
The following image shows the Update Strategy transformation with an IIF function:
[Image: An Update Strategy transformation placed after a Filter transformation, with the IIF statement above defined in the function editor.]
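To make the row-flagging rule concrete, the following Python sketch mimics the IIF expression outside of Informatica. It is an illustration only, not how the transformation is implemented: the function name flag_row and the sample rows are hypothetical, the integer values are the numeric equivalents of the DD_ constants, and the sketch assumes both dates are non-null.

```python
from datetime import date

# Numeric equivalents of the Update Strategy constants.
DD_INSERT, DD_UPDATE, DD_DELETE, DD_REJECT = 0, 1, 2, 3


def flag_row(entry_date: date, apply_date: date) -> int:
    """Mimic IIF((ENTRY_DATE > APPLY_DATE), DD_REJECT, DD_UPDATE).

    Hypothetical helper for illustration; assumes non-null dates.
    """
    return DD_REJECT if entry_date > apply_date else DD_UPDATE


# Hypothetical sample rows as (ENTRY_DATE, APPLY_DATE) pairs.
rows = [
    (date(2018, 3, 2), date(2018, 3, 1)),  # entry after apply
    (date(2018, 3, 1), date(2018, 3, 1)),  # same day
    (date(2018, 2, 1), date(2018, 3, 1)),  # entry before apply
]

flags = [flag_row(entry, apply) for entry, apply in rows]
print(flags)
```

Only a row whose entry date is strictly later than its apply date is rejected. A row with equal dates is flagged for update, because the comparison in the expression is a strict greater-than.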
When using the Update Strategy transformation, the following restrictions apply to Cloudera and Amazon EMR distributions:
  • Cloudera CDH. Cloudera discourages use of the ORC file format, which is a prerequisite for Hive transactions.
  • Amazon EMR. Because of the limitation described in HIVE-17221, the Update Strategy transformation fails on transaction-enabled, partitioned Hive tables.
On these distributions, you must therefore use an alternative strategy that updates Hive tables without enabling Hive transactions.
For more information about the Update Strategy transformation, see the Informatica 10.2.1 Big Data Management User Guide.
