The Spark engine and the Data Integration Service process overflow values differently. As a result, mapping results can vary between the native and Hadoop environments when the Spark engine processes an overflow.
Consider the following processing variation for Spark:
- If an expression results in a numerical error, such as division by zero or the SQRT of a negative number, the Spark engine returns an infinite or a NaN value. In the native environment, the expression returns null values and the rows do not appear in the output.
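The contrast can be sketched in Python; the function names here are hypothetical and only illustrate the two behaviors described above, using IEEE-754 float semantics for the Spark-style path:

```python
import math

def divide_spark_style(a, b):
    # Spark-style semantics (illustration): division by zero yields
    # +/-Infinity, and 0/0 yields NaN; the row is kept in the output.
    if b == 0.0:
        if a == 0.0:
            return math.nan
        return math.copysign(math.inf, a)
    return a / b

def divide_native_style(a, b):
    # Native-style behavior per the note: a numerical error yields
    # null (None here), and the row is dropped from the output.
    if b == 0.0:
        return None
    return a / b

print(divide_spark_style(1.0, 0.0))   # inf
print(divide_native_style(1.0, 0.0))  # None
```

A downstream filter that discards null rows would therefore keep the row in the Spark-style path but drop it in the native-style path.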
The Spark engine and the Data Integration Service process data type conversions differently. As a result, mapping results can vary between the native and Hadoop environments when the Spark engine performs a data type conversion. Consider the following processing variations for Spark:
- The Spark engine ignores the scale argument of the TO_DECIMAL function. The function returns a value with the same scale as the input value.
- When the scale of a double or decimal value is smaller than the configured scale, the Spark engine trims the trailing zeros.
- The Spark engine cannot process dates to the nanosecond. It can return date/time data with a precision of up to the microsecond.
The Hadoop environment treats "/n" values as null values. If an aggregate function receives empty or NULL values, the Hadoop environment includes these values in the aggregate calculation.
Mapping validation fails if you configure SYSTIMESTAMP with a variable value, such as a port name. The function can take either no argument or the precision to which you want to retrieve the timestamp value.
The UUID4 function is supported only when it is used as an argument in UUID_UNPARSE or ENC_BASE64.
The UUID_UNPARSE function is supported only when the argument is UUID4().
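A rough Python analogy for the supported combinations (an assumption about the functions' shapes, using the standard `uuid` and `base64` modules): UUID4() produces raw 16-byte UUID data, UUID_UNPARSE formats those bytes as the canonical 36-character string, and ENC_BASE64 encodes the raw bytes as Base64 text.

```python
import base64
import uuid

raw = uuid.uuid4().bytes                  # like UUID4()
unparsed = str(uuid.UUID(bytes=raw))      # like UUID_UNPARSE(UUID4())
encoded = base64.b64encode(raw).decode()  # like ENC_BASE64(UUID4())

print(len(raw), len(unparsed))  # 16 36
```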