Hadoop Integration Guide

Replace Hive Run-time Connections with Hadoop Connections

Big Data Management requires a Hadoop connection to run mappings on the Hadoop cluster. If you used Hive connections to run mappings on the Hadoop cluster, you must generate Hadoop connections from the Hive connections.
The upgrade process generates a name for each Hadoop connection and replaces the connection name in the mappings, but it does not create the physical connection object. When the upgrade is complete, you must run a command to generate the connection.
Complete the following tasks to upgrade connections:
Generate Hadoop connections
You must generate Hadoop connections from Hive connections that are configured to run mappings in the Hadoop environment.
  1. Run infacmd isp generateHadoopConnectionFromHiveConnection to generate a Hadoop connection from a Hive connection that is configured to run in the Hadoop environment.
    The command names the connection "Autogen_<Hive connection name>". If the connection name exceeds the 128-character limit, the command fails.
  2. If the command fails, complete the following tasks:
    1. Rename the connection to meet the character limit and run the command again.
    2. Run infacmd dis replaceMappingHadoopRuntimeConnections to replace connections associated with mappings that are deployed in applications.
    3. Run infacmd mrs replaceMappingHadoopRuntimeConnections to replace connections associated with mappings that you run from the Developer tool.
  3. If the Hive connection was parameterized, you must update the connection names in the parameter file. Verify that the Hive sources, Hive targets, and the Hive engine parameters are updated with the correct connection name.
  4. If any properties changed in the cluster, such as host names, URIs, or port numbers, you must update the properties in the connections.
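The naming rule in step 1 can be checked before you run the command. The following Python sketch is illustrative only and is not part of the product. It assumes that the 128-character limit applies to the generated "Autogen_" name and that a name of exactly 128 characters is accepted. The function names are hypothetical.

```python
AUTOGEN_PREFIX = "Autogen_"   # prefix stated in this guide
MAX_NAME_LENGTH = 128         # character limit stated in this guide


def autogen_connection_name(hive_connection_name: str) -> str:
    """Return the name the guide says the command assigns."""
    return AUTOGEN_PREFIX + hive_connection_name


def exceeds_limit(hive_connection_name: str) -> bool:
    """True if the generated name would be longer than the limit.

    Assumption: the limit is checked against the generated name,
    and exactly MAX_NAME_LENGTH characters is still allowed.
    """
    return len(autogen_connection_name(hive_connection_name)) > MAX_NAME_LENGTH


def max_hive_name_length() -> int:
    """Longest Hive connection name that still fits under the assumption above."""
    return MAX_NAME_LENGTH - len(AUTOGEN_PREFIX)
```

Under these assumptions, a Hive connection name of up to 120 characters leaves room for the prefix. A longer name would need to be renamed as described in step 2.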
Associate the cluster configuration
The Hadoop, Hive, HDFS, and HBase connections must be associated with a cluster configuration. Complete the following tasks:
  1. Run infacmd isp listConnections to identify the connections that you need to upgrade. Use -ct to list connections of a particular type.
  2. Run infacmd isp UpdateConnection to associate the cluster configuration with the connection. Use -cn to name the connection and -o clusterConfigID to specify the cluster configuration.
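If you need to repeat these two steps for many connections, you might assemble the command lines in a script. The following Python sketch only builds argument lists and does not run anything. It uses the options named above (-ct, -cn, -o clusterConfigID). The "clusterConfigID=<value>" form, the executable name "infacmd", the connection type value, and the function names are assumptions. The domain and credential options are deliberately left to the caller. Verify the exact syntax in the Informatica Command Reference.

```python
from typing import List, Sequence


def list_connections_args(connection_type: str,
                          common: Sequence[str] = (),
                          executable: str = "infacmd") -> List[str]:
    """Build the argument list for listing connections of one type.

    `common` holds domain and credential options supplied by the caller;
    this sketch does not guess them.
    """
    return [executable, "isp", "listConnections", *common, "-ct", connection_type]


def update_connection_args(connection_name: str,
                           cluster_config_id: str,
                           common: Sequence[str] = (),
                           executable: str = "infacmd") -> List[str]:
    """Build the argument list for associating a cluster configuration.

    Assumption: the -o option takes the form clusterConfigID=<value>.
    """
    return [executable, "isp", "UpdateConnection", *common,
            "-cn", connection_name,
            "-o", f"clusterConfigID={cluster_config_id}"]
```

Each returned list can be passed to subprocess.run after you review it. Keeping the arguments as a list avoids shell quoting problems when a connection name contains spaces.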
For information about the infacmd commands, see the Informatica Command Reference.
