Prepare for Cluster Import from Azure HDInsight

Before the Informatica administrator can import cluster information to create a cluster configuration in the Informatica domain, the Hadoop administrator must perform some preliminary tasks.
Perform these tasks in the following situations:
  • You are integrating for the first time.
  • You upgraded from a previous version.
If you are upgrading from a previous version, verify the properties and suggested values, because Data Engineering Integration might require additional properties or different values for existing properties.
Complete the following tasks to prepare the cluster before the Informatica administrator creates the cluster configuration:
  1. When the Informatica domain is on-premises, verify that a VPN connection is enabled between the domain and the Azure HDInsight cloud network.
  2. When the Informatica domain is deployed on the Azure cloud, verify the following requirements:
    • The domain can access the private or internal IP addresses of all HDInsight cluster nodes and can connect to the required ports. For the list of HDInsight ports, see the article Configuring Ports for Big Data Products.
    • When the domain and the HDInsight cluster reside in different virtual networks, known as "VNets," peering is enabled between the virtual networks. See the Azure documentation for instructions.
  3. Verify the property values in the *-site.xml files that Data Engineering Integration requires to run mappings in the Hadoop environment.
  4. Provide the Informatica administrator with the information required to import cluster information into the domain. Depending on the method of import, perform one of the following tasks:
    • To import directly from the cluster, give the Informatica administrator the authentication information needed to connect to the cluster.
    • To import from an archive file, export the cluster information and provide the resulting archive file to the Informatica administrator.
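
The connectivity requirement in step 2 can be spot-checked from the domain machine before the import is attempted. The following is a minimal sketch, not an Informatica utility: the function name check_ports is hypothetical, and the host addresses and port numbers must come from your own cluster and from the article Configuring Ports for Big Data Products. It only confirms that a TCP connection can be opened; it does not validate the service behind the port.

```python
import socket


def check_ports(hosts, ports, timeout=3.0):
    """Return the (host, port) pairs that could not be reached over TCP.

    hosts: private or internal IP addresses of the HDInsight cluster nodes.
    ports: port numbers taken from the ports article (not hard-coded here).
    """
    unreachable = []
    for host in hosts:
        for port in ports:
            try:
                # create_connection raises OSError on refusal, timeout, or no route.
                with socket.create_connection((host, port), timeout=timeout):
                    pass
            except OSError:
                unreachable.append((host, port))
    return unreachable
```

For example, check_ports(["10.0.0.4", "10.0.0.5"], [8020]) would return an empty list when every node accepts connections on the port; the addresses and the port shown here are placeholders only. A non-empty result suggests a firewall rule, missing VPN route, or missing VNet peering to investigate.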
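
The property check in step 3 can also be scripted. The sketch below assumes the standard Hadoop site-file layout, a configuration element containing property elements with name and value children. The function names and the sample property names (example.property.one, example.property.two) are hypothetical; the properties and values that Data Engineering Integration actually requires must be taken from the Informatica documentation.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the Hadoop *-site.xml layout; not real required properties.
SAMPLE_SITE_XML = """
<configuration>
  <property>
    <name>example.property.one</name>
    <value>true</value>
  </property>
  <property>
    <name>example.property.two</name>
    <value>1024</value>
  </property>
</configuration>
"""


def load_site_properties(xml_text):
    """Parse the text of a *-site.xml file into a {name: value} dictionary."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            # A missing or empty <value> element is stored as an empty string.
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def find_mismatches(actual, expected):
    """Return {name: (expected_value, actual_value_or_None)} for each
    expected property that is missing or has a different value."""
    return {
        name: (want, actual.get(name))
        for name, want in expected.items()
        if actual.get(name) != want
    }
```

To check a real file, read its contents and pass the text to load_site_properties, then compare the result against a dictionary of the documented values with find_mismatches. An empty result means every listed property is present with the expected value.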