Define the connections that you want to use to access data in HBase, HDFS, Hive, or relational databases, or to run a mapping on a Hadoop cluster. You can create the connections using the Developer tool, the Administrator tool, or infacmd.
You can create the following types of connections:
Hadoop connection
Create a Hadoop connection to run mappings on the Hadoop cluster. Select the Hadoop connection if you select the Hadoop run-time environment. You must also select the Hadoop connection to validate a mapping to run on the Hadoop cluster. Before you run mappings on the Hadoop cluster, review the information in this guide about rules and guidelines for mappings that you can run on the Hadoop cluster.
HDFS connection
Create an HDFS connection to read data from or write data to the HDFS file system on the Hadoop cluster.
HBase connection
Create an HBase connection to access HBase. The HBase connection is a NoSQL connection.
Hive connection
Create a Hive connection to access Hive as a source or target. You can access Hive as a source if the mapping is enabled for the native or Hadoop environment. You can access Hive as a target only if the mapping uses the Hive engine.
JDBC connection
Create a JDBC connection and configure Sqoop properties in the connection to import and export relational data through Sqoop. You must also create a Hadoop connection to run the mapping on the Hadoop cluster.
For information about creating connections to other sources or targets, such as social media web sites or Teradata, see the respective PowerExchange adapter user guide.
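The rules in the connection descriptions above can be summarized in a small planning sketch. This is only an illustration and does not use any Informatica API: the `MappingPlan` class, the `required_connections` function, and the string labels for connection types and engines are all hypothetical names introduced for the example.

```python
from dataclasses import dataclass, field

# Hypothetical labels for the connection types described above. They are
# illustrative strings, not identifiers used by any Informatica tool.
HADOOP, HDFS, HBASE, HIVE, JDBC = "Hadoop", "HDFS", "HBase", "Hive", "JDBC"


@dataclass
class MappingPlan:
    """Hypothetical description of a mapping, used only for this sketch."""
    run_on_hadoop: bool = False                 # Hadoop run-time environment selected
    engine: str = "native"                      # assumed labels: "native" or "hive"
    sources: set = field(default_factory=set)   # data stores the mapping reads
    targets: set = field(default_factory=set)   # data stores the mapping writes
    uses_sqoop: bool = False                    # relational data moved through Sqoop


def required_connections(plan):
    """Return (needed connection types, rule violations) for a plan."""
    needed = set()
    problems = []

    # HDFS, HBase, and Hive each need their own connection type.
    for store in plan.sources | plan.targets:
        if store in (HDFS, HBASE, HIVE):
            needed.add(store)

    # Sqoop import/export needs a JDBC connection and a Hadoop connection.
    if plan.uses_sqoop:
        needed.update({JDBC, HADOOP})

    # Running or validating on the cluster needs the Hadoop connection.
    if plan.run_on_hadoop:
        needed.add(HADOOP)

    # Hive can be a target only when the mapping uses the Hive engine.
    if HIVE in plan.targets and plan.engine != "hive":
        problems.append("Hive target requires the Hive engine")

    return needed, problems


if __name__ == "__main__":
    plan = MappingPlan(run_on_hadoop=True, engine="hive",
                       sources={HDFS}, targets={HIVE})
    needed, problems = required_connections(plan)
    print(sorted(needed), problems)
```

A checklist like this can help confirm which connections to create before you open the Developer tool, but the product's own validation remains the authority on whether a mapping can run.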

Updated July 03, 2018