Table of Contents

Search

  1. Preface
  2. Introduction to Informatica Data Engineering Integration
  3. Mappings
  4. Mapping Optimization
  5. Sources
  6. Targets
  7. Transformations
  8. Python Transformation
  9. Data Preview
  10. Cluster Workflows
  11. Profiles
  12. Monitoring
  13. Hierarchical Data Processing
  14. Hierarchical Data Processing Configuration
  15. Hierarchical Data Processing with Schema Changes
  16. Intelligent Structure Models
  17. Blockchain
  18. Stateful Computing
  19. Appendix A: Connections Reference
  20. Appendix B: Data Type Reference
  21. Appendix C: Function Reference

Java Transformation on the Spark Engine

Java Transformation on the Spark Engine

Some processing rules for the Spark engine differ from the processing rules for the Data Integration Service.

General Restrictions

The Java transformation is supported with the following restrictions on the Spark engine:
  • The Java code in the transformation cannot write output to standard output when you push transformation logic to Hadoop. The Java code can write output to standard error which appears in the log files.
  • For date/time values, the Spark engine supports the precision of up to microseconds. If a date/time value contains nanoseconds, the trailing digits are truncated.

Partitioning

The Java transformation has the following restrictions when used with partitioning:
  • The Partitionable property must be enabled in the Java transformation. The transformation cannot run in one partition.
  • The following restrictions apply to the Transformation Scope property:
    • The value Transaction for transformation scope is not valid.
    • If you enable an input port for partition key, the transformation scope must be set to All Input.
    • Stateless must be enabled if the transformation scope is row.

Mapping Validation

Mapping validation fails in the following situations:
  • You reference an unconnected Lookup transformation from an expression within a Java transformation.
  • You select a port of a complex data type as the partition or sort key.
  • You enable nanosecond processing in date/time and the Java transformation contains a port of complex data type with an element of a date/time type. For example, a port of type
    array<data/time>
    is not valid if you enable nanosecond processing in date/time.
The mapping fails in the following situations:
  • The Java transformation and the mapping use different precision modes when the Java transformation contains a decimal port or a complex port with an element of a decimal data type.
    Even if high precision is enabled in the mapping, the mapping processes data in low-precision mode in some situations, such as when the mapping contains a complex port with an element of a decimal data type, or the mapping is a streaming mapping. If high precision is enabled in both the Java transformation and the mapping, but the mapping processes data in low-precision mode, the mapping fails.
  • Binary null characters are passed to an output port. To avoid a mapping failure, you can add code to the Java transformation that replaces the binary null characters with an alternative character before writing the data to the output ports.

Using External .jar Files

To use external .jar files in a Java transformation, perform the following steps:
  1. Copy external .jar files to the Informatica installation directory in the Data Integration Service machine at the following location:
    <Informatic installation directory>/services/shared/jars
  2. Recycle the Data Integration Service.
  3. On the machine that hosts the Developer tool where you develop and run the mapping that contains the Java transformation:
    1. Copy external .jar files to a directory on the local machine.
    2. Edit the Java transformation to include an import statement pointing to the local .jar files.
    3. Update the classpath in the Java transformation.
    4. Compile the transformation.

0 COMMENTS

We’d like to hear from you!