Big Data Streaming User Guide

10.2.1


Transformations in a Streaming Mapping

Informatica Developer provides a set of transformations that perform specific functions. Some restrictions and guidelines apply to processing transformations in a streaming mapping.

You can use the following transformations in a streaming mapping:

Aggregator: Mapping validation fails in the following situations:
The transformation contains stateful variable ports.
The transformation contains unsupported functions in an expression.
Data Masking: Supported without restrictions.
Expression: Mapping validation fails when the transformation contains unsupported functions in an expression.

If an expression results in a numerical error, such as division by zero or the square root of a negative number, the expression returns an infinite or NaN value. In the native environment, by contrast, the expression returns null values and the rows do not appear in the output.
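The infinite and NaN results described above are the values that standard IEEE 754 floating-point arithmetic produces on the JVM. The following Java sketch is illustrative only: the class and method names are hypothetical, and it does not call any Informatica API. It shows how plain double arithmetic yields these values and how they can be detected.

```java
class NumericErrorDemo {
    // Double division by zero does not throw; it yields an infinity
    // (or NaN for 0.0 / 0.0).
    static double divide(double numerator, double denominator) {
        return numerator / denominator;
    }

    // True when the value is neither infinite nor NaN.
    static boolean isUsable(double value) {
        return !Double.isInfinite(value) && !Double.isNaN(value);
    }

    public static void main(String[] args) {
        double quotient = divide(1.0, 0.0); // positive infinity
        double root = Math.sqrt(-4.0);      // NaN
        System.out.println(quotient + " usable=" + isUsable(quotient));
        System.out.println(root + " usable=" + isUsable(root));
    }
}
```

A check such as isUsable could be applied to an output port if rows with infinite or NaN values need to be routed or filtered separately.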
Filter: Supported without restrictions.
Java: The following restrictions apply to the Java transformation:
The value Transaction for transformation scope is not valid.
The transformation is always stateless.
The Partitionable field is ignored.

To run user code directly on the Spark engine, the JDK version that the Data Integration Service uses must be compatible with the JRE version on the cluster. For best performance, create the environment variable DIS_JDK_HOME on the Data Integration Service in the Administrator tool. The environment variable contains the path to the JDK installation folder on the machine running the Data Integration Service. For example, you might enter a value such as /usr/java/default.

You can use complex data types to process hierarchical data. The complex data type support for the Java transformation is available for technical preview. Technical preview functionality is supported but is unwarranted and is not production-ready. Informatica recommends that you use it in non-production environments only.

For more information about Java transformation support on the Spark engine, see the Informatica Big Data Management User Guide.
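One way to compare the JDK that the Data Integration Service uses with the JRE on the cluster is to print the Java runtime details on each machine. The sketch below uses only standard Java system properties and environment lookups. The class name is hypothetical, and reading DIS_JDK_HOME from the process environment is an assumption: the variable is defined in the Administrator tool and may not be visible to an arbitrary shell session.

```java
class JavaRuntimeInfo {
    // Returns a one-line summary of the running Java version and its install location.
    static String describeRuntime() {
        return "java.version=" + System.getProperty("java.version")
                + ", java.home=" + System.getProperty("java.home");
    }

    // Returns the value of the environment variable, or a placeholder when it is not set.
    static String envOrUnset(String name) {
        String value = System.getenv(name);
        return value == null ? "<unset>" : value;
    }

    public static void main(String[] args) {
        System.out.println(describeRuntime());
        // Assumption: DIS_JDK_HOME is exported to this process's environment.
        System.out.println("DIS_JDK_HOME=" + envOrUnset("DIS_JDK_HOME"));
    }
}
```

Running the same class on the Data Integration Service machine and on a cluster node gives two version strings to compare.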
Joiner: Mapping validation fails in the following situations:
Case sensitivity is disabled.
Lookup: Use a Lookup transformation to look up data in a flat file, HDFS, Hive, or Sqoop, and to perform an uncached lookup on HBase data.

Mapping validation fails in the following situations:
Case sensitivity is disabled.
The lookup is a data object.

The mapping fails in the following situations:
The transformation is unconnected.

You cannot use a float data type to look up data in a Hive table, because comparing floating-point numbers for equality is unsafe.
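The risk with floating-point equality can be seen in a few lines of plain Java. This is an illustrative sketch with hypothetical class and method names, not Informatica or Hive code. It shows that a float widened to double rarely equals the double with the same decimal text, and that simple arithmetic can miss an expected value by a rounding error.

```java
class FloatEqualityDemo {
    // Widening a float to double keeps the float's rounding error, so the result
    // often differs from the double literal written with the same decimal text.
    static boolean floatMatchesDouble(float f, double d) {
        return (double) f == d;
    }

    // Tolerance-based comparison, shown for contrast with exact equality.
    static boolean nearlyEqual(double a, double b, double epsilon) {
        return Math.abs(a - b) <= epsilon;
    }

    public static void main(String[] args) {
        System.out.println(floatMatchesDouble(0.1f, 0.1));     // false
        System.out.println(floatMatchesDouble(0.5f, 0.5));     // true: 0.5 is exactly representable
        System.out.println(0.1 + 0.2 == 0.3);                  // false
        System.out.println(nearlyEqual(0.1 + 0.2, 0.3, 1e-9)); // true
    }
}
```

An equality-based lookup condition behaves like the exact comparisons here, which is why a key that looks identical in decimal form can fail to match.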

When you configure the transformation to return the first, last, or any value on multiple matches, the Data Integration Service returns any value.

To use a Lookup transformation on Sqoop in a Cloudera distribution, perform the following configuration:
In the YARN configuration, locate the property NodeManager Advanced Configuration Snippet (Safety Valve) for mapred-site.xml.

Add the following XML snippet:

<property>
  <name>mapreduce.application.classpath</name>
  <value>$HADOOP_MAPRED_HOME/,$HADOOP_MAPRED_HOME/lib/, $MR2_CLASSPATH</value>
</property>

Informatica recommends that you select the Ignore null values that match property in the Lookup transformation advanced properties to avoid a cross join of DataFrames.

To use a Lookup transformation on uncached HBase tables, perform the following steps:

Create an HBase data object. When you add an HBase table as the resource for an HBase data object, include the ROW ID column.
Create an HBase read data operation and import it into the streaming mapping.
When you import the data operation to the mapping, select the Lookup option.
In the Lookup tab, configure the following options:
Lookup column. Specify an equality condition on ROW ID.
Operator. Specify =

Verify that the format of any date value in the HBase tables is a valid Java date format. Specify this format in the Date Time Format property on the Advanced Properties tab of the data object read operation.
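A candidate format string can be checked outside the product before it is entered in the Date Time Format property. The sketch below assumes that a valid Java date format means a pattern accepted by java.text.SimpleDateFormat; the source does not state which Java date API is used, so treat this as an approximation. The class and method names are hypothetical.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;

class DateFormatCheck {
    // Returns true when the pattern compiles as a SimpleDateFormat pattern.
    static boolean isValidPattern(String pattern) {
        try {
            new SimpleDateFormat(pattern);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    // Returns true when the sample parses with the pattern in non-lenient mode.
    // Text after the parsed date is not checked.
    static boolean matches(String pattern, String sample) {
        SimpleDateFormat format = new SimpleDateFormat(pattern);
        format.setLenient(false); // reject out-of-range fields such as month 13
        try {
            format.parse(sample);
            return true;
        } catch (ParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String pattern = "yyyy-MM-dd HH:mm:ss";
        System.out.println(isValidPattern(pattern));                  // true
        System.out.println(matches(pattern, "2020-01-31 23:59:59"));  // true
        System.out.println(matches(pattern, "2020-13-31 23:59:59"));  // false
    }
}
```

Testing a few representative values from the HBase table against the pattern helps catch a mismatch before the mapping runs.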

Mapping validation fails in the following situations:
You do not include ROW ID in the condition.
You specify any operator other than =.
You include multiple conditions in the transformation.
You select a column of type date from the input columns.
You look up binary data.
Normalizer: Supported without restrictions.
Python: Supported without restrictions.
The Python transformation is available for technical preview. Technical preview functionality is supported but is unwarranted and is not production-ready. Informatica recommends that you use it in non-production environments only.
Rank: Mapping validation fails if case sensitivity is disabled.
Router: Supported without restrictions.
Sorter: To use the Sorter transformation in a streaming mapping, configure the following properties:

Advanced properties of the data object write properties. Enable the Maintain Row Order field.
Custom properties of the Data Integration Service. Set the ExecutionContextOptions.Infa.HonorTargetOrdering property to true if there are one or more transformations between the Sorter transformation and the target.

Mapping validation fails in the following situations:
Case sensitivity is disabled.

The Data Integration Service logs a warning and ignores the Sorter transformation in the following situations:
There is a type mismatch between the target and the Sorter transformation sort keys.
The transformation contains sort keys that are not connected to the target.
The transformation is not directly upstream from the Write transformation.

The Data Integration Service treats null values as high even if you configure the transformation to treat null values as low.
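The effect of treating null values as high can be pictured with a standard Java comparator. The sketch below is an analogy, not the Data Integration Service implementation, and the class and method names are hypothetical. Nulls compare greater than every non-null value, so they land last in ascending order and first in descending order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class NullsHighSortDemo {
    // Sorts ascending with nulls treated as higher than every non-null value.
    static List<Integer> sortAscendingNullsHigh(List<Integer> values) {
        List<Integer> sorted = new ArrayList<>(values);
        sorted.sort(Comparator.nullsLast(Comparator.<Integer>naturalOrder()));
        return sorted;
    }

    // Sorts descending under the same rule, so nulls come first.
    static List<Integer> sortDescendingNullsHigh(List<Integer> values) {
        List<Integer> sorted = new ArrayList<>(values);
        sorted.sort(Comparator.nullsLast(Comparator.<Integer>naturalOrder()).reversed());
        return sorted;
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(3, null, 1);
        System.out.println(sortAscendingNullsHigh(input));  // [1, 3, null]
        System.out.println(sortDescendingNullsHigh(input)); // [null, 3, 1]
    }
}
```

If a downstream consumer expects nulls at the opposite end, the ordering has to be handled after the Sorter transformation, because the null-handling setting has no effect here.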
Union: Supported without restrictions.
Window: Supported without restrictions.
See the Window Transformation chapter in this guide for more information.