Installation for Data Engineering

Create the Cluster Configuration

After you configure the data profiling warehouse connection, you can create the cluster configuration to connect to the non-native environment.
  1. Enter the name of the cluster configuration to create.
  2. Specify the non-native distribution for the cluster.
    The following table describes the options you can specify:

    Option  Description
    ------  -----------
    1       Cloudera. You can create a cluster configuration for a Cloudera cluster on
            either Cloudera Data Platform (CDP) or Cloudera Distribution Hadoop (CDH).
    2       Hortonworks
    3       Azure HDInsight
    4       MapR. You must import MapR cluster configuration properties from an archive file.
    5       Amazon EMR. You must import Amazon EMR cluster configuration properties from an archive file.
    6       Databricks
    7       Google Dataproc

    Before you import Amazon EMR cluster configuration properties, verify that the following ports associated with Amazon EMR are available:

    Hadoop Component        Port
    ----------------------  -----
    HDFS read               50010
    HDFS write              50020
    Hive metastore          9083
    HiveServer              10000
    MySQL                   3306
    NameNode                8020
    ResourceManager         8050
    ResourceManager webapp  8088
    ResourceTracker         8031
    Scheduler address       8030
    Shuffle HTTP            13562
    ZooKeeper               2181

  3. Import configuration properties from the non-native environment to create the cluster configuration.
    • To import the properties from an archive file, press 1. If you create a cluster configuration for an Amazon EMR, MapR, or Google Dataproc cluster, you must import the properties from an archive file.
    • To import the properties directly from the cluster, press 2.
  4. If you choose to import the properties from an archive file, specify the name of the configuration archive file and the path to the file.
  5. If you choose to import the properties directly from the cluster, specify the connection properties.
    The following table describes the Cloudera, Hortonworks, or Azure HDInsight cluster properties you specify:

    Property      Description
    ------------  -----------
    Host          The host name or IP address of the cluster manager.
    Port          Port of the cluster manager.
    User ID       Cluster user name.
    Password      Password for the cluster user.
    Cluster Name  Name of the cluster. Use the display name if the cluster manager manages
                  multiple clusters. If you do not provide a cluster name, the wizard imports
                  information based on the default cluster.
    Engine type   If you specified a Cloudera cluster, the installer prompts for the engine type.
                  If you are on a CDP cluster, accept the default engine type of Tez. If you are
                  on a CDH cluster, press 2 to set the engine type to MRv2. Default is 1.

    The following table describes the Databricks cluster properties you specify:

    Property               Description
    ---------------------  -----------
    Databricks domain      Enter the URL of the Databricks cluster.
    Databricks token ID    Enter the token ID of the Databricks cluster.
    Databricks cluster ID  Enter the cluster ID of the Databricks cluster.
  6. To create the Hadoop, HDFS, Hive, HBase, or Databricks connections to the cluster, press 1.
    The installer appends the connection type to the cluster configuration name to create a connection name.
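
As a rough illustration of the Amazon EMR port check described in step 2, the following Python sketch tries to open a TCP connection to each listed port. It assumes that "available" means reachable over TCP from the machine where you run the installer, and that every component answers on a single host, which is a simplification of a real cluster. The names `EMR_PORTS`, `is_port_reachable`, and `check_ports` are hypothetical helpers and are not part of the Informatica installer.

```python
import socket

# Ports taken from the Amazon EMR table in step 2.
EMR_PORTS = {
    "HDFS read": 50010,
    "HDFS write": 50020,
    "Hive metastore": 9083,
    "HiveServer": 10000,
    "MySQL": 3306,
    "NameNode": 8020,
    "ResourceManager": 8050,
    "ResourceManager webapp": 8088,
    "ResourceTracker": 8031,
    "Scheduler address": 8030,
    "Shuffle HTTP": 13562,
    "ZooKeeper": 2181,
}


def is_port_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_ports(host, ports=EMR_PORTS, timeout=3.0):
    """Map each component name to True (reachable) or False (not reachable).

    Assumption: all components are probed on the same host. On a real
    cluster, components may run on different nodes.
    """
    return {
        name: is_port_reachable(host, port, timeout)
        for name, port in ports.items()
    }
```

For example, calling `check_ports("emr-master.example.com")` (a placeholder host name) returns a dictionary such as `{"HDFS read": True, ...}`. Any component reported as `False` is worth investigating before you import the cluster configuration properties.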
