Execution Environment

Configure non-native properties, pushdown configuration properties, and source configuration properties in the Execution Environment area.
The following table describes properties that you can configure for the Hadoop and Databricks environments:
Name
Description
Connection
Configure for the Hadoop and Databricks environments.
Defines the connection information that the Data Integration Service requires to push the mapping execution to the compute cluster. Select the non-native connection to run the mapping in the compute cluster. You can assign a user-defined parameter for the non-native connection.
Runtime Properties
Configure for the Hadoop environment.
You can configure run-time properties for the Hadoop environment in the Data Integration Service, the Hadoop connection, and in the mapping. You can override a property configured at a high level by setting the value at a lower level. For example, if you configure a property in the Data Integration Service custom properties, you can override it in the Hadoop connection or in the mapping. The Data Integration Service processes property overrides based on the following priorities:
  1. Mapping custom properties set using infacmd ms runMapping with the -cp option
  2. Mapping run-time properties for the Hadoop environment
  3. Hadoop connection advanced properties for run-time engines
  4. Hadoop connection advanced general properties, environment variables, and classpaths
  5. Data Integration Service custom properties
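The priority order above can be pictured as a lookup that walks the levels from highest to lowest priority and returns the first value it finds. The following is a minimal sketch of that idea, not the Data Integration Service's actual implementation. The level labels, the `resolve_property` function, and the sample values are hypothetical and exist only for illustration.

```python
from typing import Mapping, Optional, Tuple

# Levels listed from highest to lowest priority, following the list above.
# The label strings are illustrative names, not product identifiers.
PRIORITY = (
    "runMapping -cp custom properties",
    "mapping run-time properties",
    "Hadoop connection advanced properties (run-time engines)",
    "Hadoop connection advanced general properties",
    "Data Integration Service custom properties",
)


def resolve_property(
    name: str,
    levels: Mapping[str, Mapping[str, str]],
) -> Optional[Tuple[str, str]]:
    """Return (level, value) for the highest-priority level that sets `name`.

    Returns None when no level sets the property.
    """
    for level in PRIORITY:
        value = levels.get(level, {}).get(name)
        if value is not None:
            return level, value
    return None


if __name__ == "__main__":
    # Hypothetical configuration: the same property is set at two levels.
    configured = {
        "Data Integration Service custom properties": {"spark.executor.memory": "2g"},
        "mapping run-time properties": {"spark.executor.memory": "4g"},
    }
    print(resolve_property("spark.executor.memory", configured))
```

In this sketch, the value set in the mapping run-time properties is returned because that level ranks above the Data Integration Service custom properties.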
Reject File Directory
Configure for the Hadoop environment.
The directory for Hadoop mapping files on HDFS when you run mappings in the Hadoop environment.
The Blaze engine can write reject files to the Hadoop environment for flat file, HDFS, and Hive targets. The Spark engine can write reject files to the Hadoop environment for flat file and HDFS targets.
Choose one of the following options:
  • On the Hadoop Cluster. The reject files are moved to the reject directory configured in the Hadoop connection. If the directory is not configured, the mapping will fail.
  • Defer to the Hadoop Connection. The reject files are moved based on whether the reject directory is enabled in the Hadoop connection properties. If the reject directory is enabled, the reject files are moved to the reject directory configured in the Hadoop connection. Otherwise, the Data Integration Service stores the reject files based on the RejectDir system parameter.
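The two options differ only in what happens when the Hadoop connection has no reject directory. The sketch below models that decision under a simplifying assumption: a reject directory that is "enabled" and one that is "configured" are treated as the same thing, represented by a non-empty `connection_reject_dir`. The names `RejectOption`, `reject_location`, and the paths in the usage example are hypothetical, and the sketch is not how the product resolves the location internally.

```python
from enum import Enum
from typing import Optional


class RejectOption(Enum):
    ON_HADOOP_CLUSTER = "On the Hadoop Cluster"
    DEFER_TO_CONNECTION = "Defer to the Hadoop Connection"


def reject_location(
    option: RejectOption,
    connection_reject_dir: Optional[str],
    system_reject_dir: str,
) -> str:
    """Return the directory where reject files would end up.

    connection_reject_dir: reject directory from the Hadoop connection, or
        None when it is not enabled (assumption: enabled == configured).
    system_reject_dir: value of the RejectDir system parameter.
    """
    if option is RejectOption.ON_HADOOP_CLUSTER:
        if not connection_reject_dir:
            # The source states that the mapping fails in this case.
            raise ValueError(
                "Mapping fails: no reject directory is configured "
                "in the Hadoop connection"
            )
        return connection_reject_dir

    # Defer to the Hadoop Connection: use the connection directory when it
    # is available, otherwise fall back to the RejectDir system parameter.
    return connection_reject_dir or system_reject_dir


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(reject_location(RejectOption.DEFER_TO_CONNECTION, None, "/infa/reject"))
```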
You can configure the following pushdown configuration properties:
Name
Description
Pushdown type
Configure for the Hadoop environment.
Choose one of the following options:
  • None. Select no pushdown type for the mapping.
  • Source. The Data Integration Service tries to push down transformation logic to the source database.
  • Full. The Data Integration Service pushes the full transformation logic to the source database.
Pushdown Compatibility
Configure for the Hadoop environment.
Optionally, if you choose full pushdown optimization and the mapping contains an Update Strategy transformation, you can choose a pushdown compatibility option or assign a pushdown compatibility parameter.
Choose one of the following options:
  • Multiple rows do not have the same key. The transformation connected to the Update Strategy transformation receives multiple rows without the same key. The Data Integration Service can push the transformation logic to the target.
  • Multiple rows with the same key can be reordered. The target transformation connected to the Update Strategy transformation receives multiple rows with the same key that can be reordered. The Data Integration Service can push the Update Strategy transformation to the non-native environment.
  • Multiple rows with the same key cannot be reordered. The target transformation connected to the Update Strategy transformation receives multiple rows with the same key that cannot be reordered. The Data Integration Service cannot push the Update Strategy transformation to the non-native environment.
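As a rough way to reason about which compatibility option fits a data set, the sketch below checks a sample of rows for repeated keys and reports whether the Update Strategy logic could be pushed down under the chosen option. It assumes the rows are available as Python dictionaries and that you already know whether row order matters. The names `PushdownCompatibility`, `suggest_option`, and `can_push_update_strategy` are hypothetical and are not part of any Informatica API.

```python
from collections import Counter
from enum import Enum
from typing import Any, Iterable, Mapping


class PushdownCompatibility(Enum):
    NO_DUPLICATE_KEYS = "Multiple rows do not have the same key"
    REORDER_ALLOWED = "Multiple rows with the same key can be reordered"
    REORDER_NOT_ALLOWED = "Multiple rows with the same key cannot be reordered"


def can_push_update_strategy(option: PushdownCompatibility) -> bool:
    """Per the descriptions above, only the last option blocks pushdown."""
    return option is not PushdownCompatibility.REORDER_NOT_ALLOWED


def suggest_option(
    rows: Iterable[Mapping[str, Any]],
    key: str,
    order_matters: bool,
) -> PushdownCompatibility:
    """Suggest an option from sample rows (illustrative heuristic only)."""
    counts = Counter(row[key] for row in rows)
    if all(n == 1 for n in counts.values()):
        return PushdownCompatibility.NO_DUPLICATE_KEYS
    if order_matters:
        return PushdownCompatibility.REORDER_NOT_ALLOWED
    return PushdownCompatibility.REORDER_ALLOWED


if __name__ == "__main__":
    # Hypothetical sample: two rows share the key value 1.
    sample = [{"id": 1, "op": "insert"}, {"id": 1, "op": "update"}, {"id": 2, "op": "insert"}]
    option = suggest_option(sample, "id", order_matters=True)
    print(option.value, can_push_update_strategy(option))
```

A sample can only show that duplicate keys exist, not that they never occur, so treat the suggestion as a starting point rather than a guarantee.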
You can configure the following source properties for the Hadoop and Databricks environments:
Name                        Description
Maximum Rows Read           Reserved for future use.
Maximum Runtime Interval    Reserved for future use.
State Store                 Reserved for future use.


Updated July 10, 2020