Table of Contents

Search

  1. Preface
  2. Introduction to Informatica Big Data Management
  3. Mappings in the Hadoop Environment
  4. Mapping Sources in the Hadoop Environment
  5. Mapping Targets in the Hadoop Environment
  6. Mapping Transformations in the Hadoop Environment
  7. Processing Hierarchical Data on the Spark Engine
  8. Configuring Transformations to Process Hierarchical Data
  9. Processing Unstructured and Semi-structured Data with an Intelligent Structure Model
  10. Stateful Computing on the Spark Engine
  11. Monitoring Mappings in the Hadoop Environment
  12. Mappings in the Native Environment
  13. Profiles
  14. Native Environment Optimization
  15. Cluster Workflows
  16. Connections
  17. Data Type Reference
  18. Function Reference
  19. Parameter Reference

Big Data Management User Guide

Big Data Management User Guide

Java Transformation Suppport on the Hive Engine

Java Transformation Suppport on the Hive Engine

You can enable the Stateless advanced property when you run mappings in a Hadoop environment.
The Java code in the transformation cannot write output to standard output when you push transformation logic to Hadoop. The Java code can write output to standard error which appears in the log files.
Some processing rules for the Hive engine differ from the processing rules for the Data Integration Service.

Partitioning

You can optimize the transformation for faster processing when you enable an input port as a partition key and sort key. The data is partitioned across the reducer tasks and the output is partially sorted.
The following restrictions apply to the Transformation Scope property:
  • The value Transaction for transformation scope is not valid.
  • If transformation scope is set to Row, a Java transformation is run by mapper script.
  • If you enable an input port for partition Key, the transformation scope is set to All Input. When the transformation scope is set to All Input, a Java transformation is run by the reducer script and you must set at least one input field as a group-by field for the reducer key.

Using External .jar Files

To use external .jar files in a Java transformation, perform the following steps:
  1. Copy external .jar files to the Informatica installation directory in the Data Integration Service machine at the following location:
    <Informatic installation directory>/services/shared/jars
    . Then recycle the Data Integration Service.
  2. On the machine that hosts the Developer tool where you develop and run the mapping that contains the Java transformation:
    1. Copy external .jar files to a directory on the local machine.
    2. Edit the Java transformation to include an import statement pointing to the local .jar files.
    3. Update the classpath in the Java transformation.
    4. Compile the transformation.

0 COMMENTS

We’d like to hear from you!