Table of Contents

Search

  1. Preface
  2. Introduction to Informatica Big Data Management
  3. Mappings
  4. Sources
  5. Targets
  6. Transformations
  7. Data Preview
  8. Cluster Workflows
  9. Profiles
  10. Monitoring
  11. Hierarchical Data Processing
  12. Hierarchical Data Processing Configuration
  13. Hierarchical Data Processing with Schema Changes
  14. Intelligent Structure Models
  15. Stateful Computing
  16. Connections
  17. Data Type Reference
  18. Function Reference

Java Transformation on the Spark Engine

Java Transformation on the Spark Engine

Some processing rules for the Spark engine differ from the processing rules for the Data Integration Service.

General Restrictions

The Java transformation is supported with the following restrictions on the Spark engine:
  • The Java code in the transformation cannot write output to standard output when you push transformation logic to Hadoop. The Java code can write output to standard error which appears in the log files.
  • For date/time values, the Spark engine supports the precision of up to microseconds. If a date/time value contains nanoseconds, the trailing digits are truncated.

Partitioning

The Java transformation has the following restrictions when used with partitioning:
  • The Partitionable property must be enabled in the Java transformation. The transformation cannot run in one partition.
  • The following restrictions apply to the Transformation Scope property:
    • The value Transaction for transformation scope is not valid.
    • If you enable an input port for partition key, the transformation scope must be set to All Input.
    • Stateless must be enabled if the transformation scope is row.

Mapping Validation

Mapping validation fails in the following situations:
  • You reference an unconnected Lookup transformation from an expression within a Java transformation.
  • You select a port of a complex data type as the partition or sort key.
  • You enable nanosecond processing in date/time and the Java transformation contains a port of complex data type with an element of a date/time type. For example, a port of type
    array<data/time>
    is not valid if you enable nanosecond processing in date/time.
  • When you enable high precision, a validation error occurs if the Java transformation contains an expression that uses a decimal port or a complex port with an element of a decimal data type, and the port is used with an operator.
    For example, if the transformation contains a decimal port
    decimal_port
    and you use the expression
    decimal_port + 1
    , a validation error occurs.
The mapping fails in the following situation:
  • The Java transformation and the mapping use different precision modes when the Java transformation contains a decimal port or a complex port with an element of a decimal data type.
    Even if high precision is enabled in the mapping, the mapping processes data in low-precision mode in some situations, such as when the mapping contains a complex port with an element of a decimal data type, or the mapping is a streaming mapping. If high precision is enabled in both the Java transformation and the mapping, but the mapping processes data in low-precision mode, the mapping fails.

Using External .jar Files

To use external .jar files in a Java transformation, perform the following steps:
  1. Copy external .jar files to the Informatica installation directory in the Data Integration Service machine at the following location:
    <Informatic installation directory>/services/shared/jars
  2. Recycle the Data Integration Service.
  3. On the machine that hosts the Developer tool where you develop and run the mapping that contains the Java transformation:
    1. Copy external .jar files to a directory on the local machine.
    2. Edit the Java transformation to include an import statement pointing to the local .jar files.
    3. Update the classpath in the Java transformation.
    4. Compile the transformation.


Updated January 20, 2020