Data Engineering Integration 10.2

Transformation Rules and Guidelines

Transformations that are not listed below are not supported.

Aggregator

Mapping validation fails in certain situations. When a mapping contains an Aggregator transformation with an input/output port that is not a group by port, the transformation might not return the last row of each group with the result of the aggregation. Because Hadoop execution is distributed, the engine might not be able to determine the actual last row of each group.

Data Masking

Mapping validation fails in certain situations.

Expression

Mapping validation fails in certain situations. If an expression produces a numerical error, such as division by zero or the square root of a negative number, it returns an infinite or NaN value. In the native environment, the expression returns null values and the rows do not appear in the output. The sketch after this entry illustrates the behavior.

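To make the difference concrete, here is a minimal standalone Java sketch (plain JDK code, not the Informatica transformation API; the class and variable names are illustrative only) showing the IEEE 754 values that such expressions produce instead of nulls:

```java
public class NumericErrors {
    public static void main(String[] args) {
        // Floating-point division by zero yields Infinity, not an exception.
        double divByZero = 1.0 / 0.0;           // Infinity

        // The square root of a negative number yields NaN.
        double sqrtNegative = Math.sqrt(-1.0);  // NaN

        // In the Hadoop environment these values flow downstream as ordinary
        // data instead of becoming nulls, so the rows stay in the output.
        System.out.println("1.0 / 0.0     = " + divByZero);
        System.out.println("Math.sqrt(-1) = " + sqrtNegative);
    }
}
```
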
Filter

Supported without restrictions.

Java

Mapping validation fails in certain situations. The following rules and guidelines also apply:

- To use external .jar files in a Java transformation, you must perform additional configuration steps.
- To run user code directly on the Spark engine, the JDK version that the Data Integration Service uses must be compatible with the JRE version on the cluster. For best performance, create the environment variable DIS_JDK_HOME on the Data Integration Service in the Administrator tool. The environment variable contains the path to the JDK installation folder on the machine that runs the Data Integration Service. For example, you might enter a value such as /usr/java/default.
- The Partitionable property must be enabled in the Java transformation. The transformation cannot run in one partition.
- For date/time values, the Spark engine supports precision of up to microseconds. If a date/time value contains nanoseconds, the trailing digits are truncated. The first sketch after this list illustrates the truncation.
- When you enable high precision and the Java transformation contains a field of the decimal data type, a validation error occurs.
- Restrictions apply to the Transformation Scope property.
- The Java code in the transformation cannot write output to standard output when you push transformation logic to Hadoop. The Java code can write output to standard error, which appears in the log files. The second sketch after this list shows the difference.

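To illustrate the microsecond limit, here is a minimal standalone Java sketch using the JDK's java.time API (not the Informatica transformation API); the timestamp value is an arbitrary example:

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;

public class MicrosecondTruncation {
    public static void main(String[] args) {
        // A timestamp with nanosecond precision.
        Instant withNanos = Instant.parse("2018-12-13T10:15:30.123456789Z");

        // Truncating to microseconds drops the trailing three digits,
        // mirroring what the Spark engine does with date/time values.
        Instant withMicros = withNanos.truncatedTo(ChronoUnit.MICROS);

        System.out.println(withNanos);  // 2018-12-13T10:15:30.123456789Z
        System.out.println(withMicros); // 2018-12-13T10:15:30.123456Z
    }
}
```
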
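And because standard output is lost when transformation logic is pushed to Hadoop, diagnostic printing inside a Java transformation should target standard error. A minimal sketch; the helper method below is hypothetical and stands in for code you would place in the transformation's Java code entry points:

```java
public class RowLogger {
    // Hypothetical helper standing in for logic inside a Java transformation.
    static void processRow(String rowId) {
        // Do NOT use System.out here: output written to standard output
        // is lost when the transformation logic runs on Hadoop.
        // System.out.println("processing row " + rowId);

        // Standard error is captured and appears in the log files.
        System.err.println("processing row " + rowId);
    }

    public static void main(String[] args) {
        processRow("42");
    }
}
```
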
Joiner

Mapping validation fails in certain situations.

Lookup

Mapping validation fails in certain situations, and the mapping itself can fail in others. When you choose to return the first, last, or any value on multiple matches, the Lookup transformation returns any value. If you configure the transformation to report an error on multiple matches, the Spark engine drops the duplicate rows and does not include the rows in the logs.

Normalizer

Supported without restrictions.

Rank

Mapping validation fails in certain situations.

Router

Supported without restrictions.

Sorter

Mapping validation fails in certain situations. In other situations, the Data Integration Service logs a warning and ignores the Sorter transformation. The Data Integration Service treats null values as high even if you configure the transformation to treat null values as low.

Union

Supported without restrictions.

Update Strategy

The Update Strategy transformation is supported only on Hadoop distributions that support Hive ACID. Mapping validation fails in certain situations, and the mapping itself can fail in others. The following rules and guidelines also apply:

- The Update Strategy transformation does not forward rejected rows to the next transformation.
- To use a Hive target table with an Update Strategy transformation, you must create the Hive target table with the following clause in the Hive Data Definition Language: TBLPROPERTIES ("transactional"="true"). The sketch after this list shows an example table definition.
- To use an Update Strategy transformation with a Hive target, verify that the required properties are configured in the hive-site.xml configuration set associated with the Hadoop connection.
- If the Update Strategy transformation receives multiple update rows for the same primary key value, the transformation selects one random row to update the target.
- If multiple Update Strategy transformations write to different instances of the same target, the target data might be unpredictable.
- The Spark engine executes operations in the following order: deletes, updates, inserts. It does not process rows in the same order as the Update Strategy transformation receives them.
- Hive targets always perform Update as Update operations. Hive targets do not support Update Else Insert or Update as Insert.

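As a sketch only, the following Java snippet creates a transactional Hive target table over JDBC. The connection URL, credentials, table name, columns, and bucket count are all assumptions for illustration, and the snippet assumes the Hive JDBC driver is on the classpath; only the TBLPROPERTIES clause comes from the guideline above. Note that Hive ACID (through Hive 2.x) also requires a bucketed table stored as ORC, which the sketch includes.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class CreateTransactionalTarget {
    public static void main(String[] args) throws Exception {
        // Hypothetical HiveServer2 URL and credentials; replace with your own.
        String url = "jdbc:hive2://hiveserver:10000/default";

        // Example DDL for a transactional (Hive ACID) target table.
        // The table name, columns, and bucket count are illustrative;
        // the TBLPROPERTIES clause is the part the guideline requires.
        String ddl =
              "CREATE TABLE customer_target (id INT, name STRING) "
            + "CLUSTERED BY (id) INTO 4 BUCKETS "
            + "STORED AS ORC "
            + "TBLPROPERTIES (\"transactional\"=\"true\")";

        try (Connection conn = DriverManager.getConnection(url, "hive", "");
             Statement stmt = conn.createStatement()) {
            stmt.execute(ddl);
        }
    }
}
```
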
Updated December 13, 2018