Create a Cluster Configuration

After the Hadoop administrator prepares the cluster for import, the Informatica administrator must create a cluster configuration.
Perform this task in the following situations:
  • You are integrating for the first time.
  • You upgraded from a version earlier than 10.2.
  • You upgraded from 10.2 and changed the distribution or distribution version.
A cluster configuration is an object in the domain that contains configuration information about the Hadoop cluster. The cluster configuration enables the Data Integration Service to push mapping logic to the Hadoop environment. Import configuration properties from the Hadoop cluster to create a cluster configuration.
The import process reads values from the cluster's *-site.xml files and organizes them into configuration sets that correspond to the individual *-site.xml files. When you perform the import, the cluster configuration wizard can create Hadoop, HBase, HDFS, and Hive connections to access the Hadoop environment. If you choose to create the connections, the wizard also associates the cluster configuration with the connections.
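As a rough illustration of how values from *-site.xml files can be grouped into one set per file, consider the following minimal Python sketch. It is not the Informatica import process or any Informatica API. The function names, the sample file contents, and the host names are hypothetical and exist only for this example. The sketch relies on the standard Hadoop *-site.xml layout, in which a configuration element holds property elements with name and value children.

```python
import xml.etree.ElementTree as ET


def parse_site_xml(xml_text):
    """Return {name: value} for every <property> in one *-site.xml document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is None:
            continue  # skip malformed entries that have no <name>
        props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def build_configuration_sets(site_files):
    """Group properties into one set per *-site.xml file (hypothetical helper).

    site_files maps a file name to its XML text. Files whose names do not
    end in "-site.xml" are ignored.
    """
    return {
        filename: parse_site_xml(text)
        for filename, text in site_files.items()
        if filename.endswith("-site.xml")
    }


# Illustrative input only: the host names and ports are made up.
SAMPLE_FILES = {
    "core-site.xml": """
        <configuration>
          <property>
            <name>fs.defaultFS</name>
            <value>hdfs://namenode.example.com:8020</value>
          </property>
        </configuration>
    """,
    "hive-site.xml": """
        <configuration>
          <property>
            <name>hive.metastore.uris</name>
            <value>thrift://metastore.example.com:9083</value>
          </property>
        </configuration>
    """,
    "README.txt": "not a site file",
}

config_sets = build_configuration_sets(SAMPLE_FILES)
for set_name, properties in config_sets.items():
    print(set_name, properties)
```

In this sketch each configuration set is keyed by the name of the file it came from, which mirrors the one-set-per-file grouping described above. How the product names and stores its configuration sets may differ.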
If you imported the cluster configuration when you installed Enterprise Data Lake with the Informatica domain, you can create the cluster configuration again or refresh the cluster configuration.
For more information about the cluster configuration, see the Big Data Management Administrator Guide.

