Big Data Streaming User Guide

Transformations in a Streaming Mapping

Informatica Developer provides a set of transformations that perform specific functions. Some restrictions and guidelines apply to processing transformations in a streaming mapping.
You can use the following transformations in a streaming mapping:
Aggregator
Mapping validation fails in the following situations:
  • The transformation contains stateful variable ports.
  • The transformation contains unsupported functions in an expression.
Data Masking
Supported without restrictions.
Expression
Mapping validation fails when the transformation contains unsupported functions in an expression.
If an expression results in a numerical error, such as division by zero or SQRT of a negative number, the expression returns an infinite or a NaN value on the Spark engine. In the native environment, the expression returns null values and the rows do not appear in the output.
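The difference between the two behaviors can be illustrated with a minimal Python sketch. The helper names `spark_style_divide`, `native_style_divide`, and `is_numerical_error` are hypothetical and written only for this illustration; they are not Informatica or Spark APIs. The sketch assumes IEEE 754 conventions (nonzero divided by zero gives an infinity, zero divided by zero gives NaN), which this guide does not spell out.

```python
import math

def spark_style_divide(numerator: float, denominator: float) -> float:
    """Hypothetical helper: yield Infinity or NaN instead of failing on division by zero."""
    if denominator == 0:
        if numerator == 0 or math.isnan(numerator):
            return math.nan
        # Assumption: the sign of the infinity follows the numerator.
        return math.copysign(math.inf, numerator)
    return numerator / denominator

def native_style_divide(numerator, denominator):
    """Hypothetical helper: yield null (None) on a numerical error."""
    if denominator == 0:
        return None
    return numerator / denominator

def is_numerical_error(value) -> bool:
    """Treat None, NaN, and +/-Infinity as the result of a numerical error."""
    return value is None or math.isnan(value) or math.isinf(value)

rows = [(10.0, 2.0), (1.0, 0.0), (0.0, 0.0)]
spark_results = [spark_style_divide(n, d) for n, d in rows]
native_results = [native_style_divide(n, d) for n, d in rows]

# Flags that a downstream check could use to spot affected rows.
flagged = [is_numerical_error(v) for v in spark_results]
```

A check like `is_numerical_error` is one way to find affected rows in streaming output, because those rows are not dropped the way they are in the native environment.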
Filter
Supported without restrictions.
Java
The following restrictions apply to the Java transformation:
  • The value Transaction for transformation scope is not valid.
  • The transformation is always Stateless.
  • The Partitionable field is ignored.
To run user code directly on the Spark engine, the JDK version that the Data Integration Service uses must be compatible with the JRE version on the cluster. For best performance, create the environment variable DIS_JDK_HOME on the Data Integration Service in the Administrator tool. The environment variable contains the path to the JDK installation folder on the machine running the Data Integration Service. For example, you might enter a value such as /usr/java/default.
You can use complex data types to process hierarchical data.
The complex data type support for the Java transformation is available for technical preview. Technical preview functionality is supported but is unwarranted and is not production-ready. Informatica recommends that you use it in non-production environments only.
For more information about Java transformation support on the Spark engine, see the Informatica Big Data Management User Guide.
Joiner
Mapping validation fails in the following situations:
  • Case sensitivity is disabled.
Lookup
Use a Lookup transformation to look up data in a flat file, HDFS, Hive, or Sqoop, and to perform an uncached lookup on HBase data.
Mapping validation fails in the following situations:
  • Case sensitivity is disabled.
  • The lookup is a data object.
The mapping fails in the following situations:
  • The transformation is unconnected.
You cannot use a float data type to look up data in a Hive table, because comparing floating-point numbers for equality is unsafe.
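The reason equality on floating-point values is unsafe can be seen in a short Python sketch. It uses Python's double-precision floats purely as an illustration; the variable names are hypothetical, and a Hive float column has lower precision than the values shown here.

```python
import math

stored_value = 0.1 + 0.2   # a value as it might be computed and stored
lookup_key = 0.3           # the value a lookup condition compares against

# Exact equality fails because neither 0.1, 0.2, nor 0.3 is exactly
# representable in binary floating point.
exact_match = stored_value == lookup_key

# A tolerance-based comparison succeeds, but a lookup condition that uses
# the = operator has no such tolerance.
tolerant_match = math.isclose(stored_value, lookup_key, rel_tol=1e-9)
```

Here `exact_match` is false and `tolerant_match` is true, which is the kind of silent mismatch that an equality lookup on a float column would produce.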
When you configure the transformation to return the first, last, or any value on multiple matches, the Data Integration Service returns any value.
To use a Lookup transformation on Sqoop in a Cloudera distribution, perform the following configuration:
  1. In the YARN configuration, locate the property NodeManager Advanced Configuration Snippet (Safety Valve) for mapred-site.xml.
  2. Add the following XML snippet:
    <property>
      <name>mapreduce.application.classpath</name>
      <value>$HADOOP_MAPRED_HOME/,$HADOOP_MAPRED_HOME/lib/, $MR2_CLASSPATH</value>
    </property>
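Before pasting the snippet from step 2 into the safety valve, it can help to confirm that it is well-formed XML. The following Python sketch is one way to do that with the standard library; the constant name `SNIPPET` and the variable names are hypothetical, and the property value is copied verbatim from the step above rather than verified against any cluster.

```python
import xml.etree.ElementTree as ET

# The snippet from step 2, with the value copied verbatim from this guide.
SNIPPET = (
    "<property> <name>mapreduce.application.classpath</name> "
    "<value>$HADOOP_MAPRED_HOME/,$HADOOP_MAPRED_HOME/lib/, $MR2_CLASSPATH</value> "
    "</property>"
)

# fromstring raises xml.etree.ElementTree.ParseError if the XML is malformed.
prop = ET.fromstring(SNIPPET)

name = prop.findtext("name")
# Split the comma-separated classpath and trim stray whitespace for review.
classpath_entries = [entry.strip() for entry in prop.findtext("value").split(",")]
```

Listing the entries separately makes it easier to spot a missing or mistyped path before the NodeManager is restarted.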
Informatica recommends that you select the Ignore null values that match property in the Lookup transformation advanced properties to avoid a cross join of DataFrames.
To use a Lookup transformation on uncached HBase tables, perform the following steps:
  1. Create an HBase data object. When you add an HBase table as the resource for an HBase data object, include the ROW ID column.
  2. Create an HBase read data operation and import it into the streaming mapping.
  3. When you import the data operation to the mapping, select the Lookup option.
  4. In the Lookup tab, configure the following options:
    • Lookup column. Specify an equality condition on ROW ID.
    • Operator. Specify =.
  5. Verify that the format of any date value in the HBase tables is a valid Java date format. Specify this format in the Date Time Format property on the Advanced Properties tab of the data object read operation.
For an uncached HBase lookup, mapping validation fails in the following situations:
  • You do not include ROW ID in the condition.
  • You specify any operator other than =.
  • You include multiple conditions in the transformation.
  • You select a column of type date from the input columns.
  • You look up binary data.
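The situations listed above can be read as a small set of rules on the lookup condition. The following Python sketch models them for illustration only. The `LookupCondition` class and the `validate_hbase_lookup` function are hypothetical and are not part of any Informatica API; the actual validation happens inside the Developer tool.

```python
from dataclasses import dataclass

@dataclass
class LookupCondition:
    """Hypothetical model of one lookup condition; not an Informatica API."""
    lookup_column: str
    operator: str
    input_column_type: str  # for example "string", "date", or "binary"

def validate_hbase_lookup(conditions):
    """Return a list of rule violations that mirror the situations listed above."""
    errors = []
    if len(conditions) > 1:
        errors.append("multiple conditions")
    if not any(c.lookup_column == "ROW ID" for c in conditions):
        errors.append("ROW ID missing from condition")
    for c in conditions:
        if c.operator != "=":
            errors.append(f"unsupported operator {c.operator}")
        if c.input_column_type == "date":
            errors.append("date input column")
        if c.input_column_type == "binary":
            errors.append("binary lookup")
    return errors

# A single equality condition on ROW ID passes every rule.
ok = validate_hbase_lookup([LookupCondition("ROW ID", "=", "string")])

# A range condition on another column with a date input breaks three rules.
bad = validate_hbase_lookup([LookupCondition("CITY", ">", "date")])
```

In this sketch, `ok` is an empty list and `bad` collects one message per broken rule, so a mapping that violates several rules would surface all of them at once.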
Normalizer
Supported without restrictions.
Python
Supported without restrictions.
The Python transformation is available for technical preview. Technical preview functionality is supported but is unwarranted and is not production-ready. Informatica recommends that you use it in non-production environments only.
Rank
Mapping validation fails if case sensitivity is disabled.
Router
Supported without restrictions.
Sorter
To use the Sorter transformation in a streaming mapping, configure the following properties:
  • Advanced properties of the data object write operation. Enable the Maintain Row Order field.
  • Custom properties of the Data Integration Service. Set the ExecutionContextOptions.Infa.HonorTargetOrdering property to true if there are one or more transformations between the Sorter transformation and the target.
Mapping validation fails in the following situations:
  • Case sensitivity is disabled.
The Data Integration Service logs a warning and ignores the Sorter transformation in the following situations:
  • There is a type mismatch between the target and the Sorter transformation sort keys.
  • The transformation contains sort keys that are not connected to the target.
  • The transformation is not directly upstream from the Write transformation.
The Data Integration Service treats null values as high even if you configure the transformation to treat null values as low.
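What "null values as high" means for the resulting order can be shown with a minimal Python sketch. The function `sort_nulls_high` is hypothetical and only models the ordering rule; it says nothing about how the Data Integration Service implements the sort.

```python
def sort_nulls_high(values, descending=False):
    """Sort values while treating None as greater than every non-null value."""
    # The key is (is_null, value): non-null values compare first by value,
    # and None always ranks above them.
    return sorted(
        values,
        key=lambda v: (v is None, 0 if v is None else v),
        reverse=descending,
    )

data = [3, None, 1, 2]
ascending = sort_nulls_high(data)                    # nulls end up last
descending = sort_nulls_high(data, descending=True)  # nulls end up first
```

With nulls treated as high, an ascending sort places null rows at the end and a descending sort places them at the start, regardless of the null ordering configured on the transformation.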
Union
Supported without restrictions.
Window
Supported without restrictions.
See the Window Transformation chapter in this guide for more information.
Transformations that are not listed here are not supported.
For more information about the transformations, see the Informatica Developer Transformation Guide.
For more information about transformation support on the Spark engine, see the Informatica Big Data Management User Guide.