Installation and Configuration Guide

Configure the Catalog Service for the External Cluster

After you configure the external cluster, you can configure the Catalog Service parameters for the existing cluster.
  1. If the cluster uses Kerberos authentication, enter the information to configure the Catalog Service parameters for the existing cluster.
    The following table describes the properties you need to set for configuring the Catalog Service parameters for the existing cluster.
    Option
    Description
    Catalog Service name
    Name of the Catalog Service.
    Catalog Service port
    Port number of the Catalog Service.
    Cluster Hadoop distribution URL
    URL to access the Hadoop cluster.
    Cluster Hadoop distribution URL user
    User name to access the Hadoop cluster.
    Cluster Hadoop distribution URL password
    Password to access the Hadoop cluster.
    Service cluster name
    Name of the service cluster.
    KDC domain name
    Domain name of the Kerberos Key Distribution Center.
    Keytab location
    Location of the Kerberos keytab file.
    Fully qualified path to the Kerberos configuration file
    Fully qualified path to the Kerberos configuration file.
    YARN Queue Name
    The YARN scheduler queue name used by the Blaze engine that specifies available resources on a cluster.
  2. If you chose Others as the cluster type, enter the information to configure the Catalog Service parameters for the existing cluster.
    The following table describes the properties you need to set for configuring the Catalog Service parameters for the existing cluster.
    Option
    Description
    Catalog Service name
    Name of the Catalog Service.
    Catalog Service port
    Port number of the Catalog Service.
    Yarn resource manager URI
    Applies to external cluster. The service within Hadoop that submits the MapReduce tasks to specific nodes in the cluster.
    Use the following format: <Hostname>:<Port>
    Where:
    • <Hostname> is the host name or IP address of the Yarn resource manager.
    • <Port> is the port number on which the Yarn resource manager listens for Remote Procedure Calls (RPC).
    Yarn resource manager HTTPS or HTTP URI
    Applies to external cluster. HTTPS or HTTP URI value for the Yarn resource manager.
    Yarn resource manager scheduler URI
    Applies to external cluster. Scheduler URI value for the Yarn resource manager.
    Zookeeper Addresses
    Multiple ZooKeeper addresses in a comma-separated list.
    HDFS Namenode URI
    Applies to external cluster. The URI to access HDFS.
    Use the following format to specify the NameNode URI in the Cloudera distribution: <Hostname>:<Port>
    Where:
    • <Hostname> is the host name or IP address of the NameNode.
    • <Port> is the port number on which the NameNode listens for Remote Procedure Calls (RPC).
    History Server HTTP URI
    Applies to external cluster. Specify a value to generate YARN allocation log files for scanners. Catalog Administrator displays the log URL as part of task monitoring.
    Service cluster name
    Applies to both internal and external clusters. Name of the service cluster. Ensure that you have the /Informatica/LDM/<ServiceClusterName> directory in HDFS. If you do not specify a service cluster name, Enterprise Data Catalog uses <DomainName>_<CatalogServiceName> as the default value. You must then have the /Informatica/LDM/<DomainName>_<CatalogServiceName> directory in HDFS. Otherwise, the Catalog Service might fail.
    HDFS Service Name for High Availability
    Applies to highly available external cluster. Specify the HDFS service name.
    HDFS Service Principal Name
    Applies to Kerberos authentication. Principal name for the HDFS Service.
    YARN Service Principal Name
    Applies to Kerberos authentication. Principal name for the YARN Service.
    KDC domain name
    The domain name of the Kerberos Key Distribution Center (KDC).
    Keytab location
    The location of the Kerberos keytab file.
    Fully qualified path to the Kerberos configuration file
    Fully qualified path to the Kerberos configuration file.
    YARN Queue Name
    The YARN scheduler queue name used by the Blaze engine that specifies available resources on a cluster.
  3. Select the load type.
    The following table describes the options you can choose.
    Option
    Description
    Demo
    Represents a single datastore. Used for demo purposes.
    Low
    Represents one million assets or 30-40 datastores.
    Medium
    Represents 20 million assets or 200-400 datastores.
    High
    Represents 50 million assets or 500-1000 datastores.
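The load types in step 3 can be read as upper bounds on the number of assets. The following Python sketch illustrates that reading. It is not part of the installer: the function name `suggest_load_type` is hypothetical, and treating each asset count as an inclusive upper limit is an assumption. Demo is left out because the table describes it as a single datastore for demo purposes, not as an asset count.

```python
# Hypothetical helper, not provided by the installer.
# Asset counts come from the load type table; treating them as
# inclusive upper limits is an assumption of this sketch.
LOAD_TYPE_ASSET_LIMITS = [
    ("Low", 1_000_000),
    ("Medium", 20_000_000),
    ("High", 50_000_000),
]


def suggest_load_type(expected_assets: int) -> str:
    """Return the smallest load type whose asset limit covers the estimate."""
    if expected_assets < 0:
        raise ValueError("expected_assets must not be negative")
    for name, limit in LOAD_TYPE_ASSET_LIMITS:
        if expected_assets <= limit:
            return name
    raise ValueError("estimate exceeds the largest documented load type")
```

For example, `suggest_load_type(5_000_000)` returns `"Medium"` under these assumptions, because five million assets exceed the Low limit but fit within the Medium limit.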
The Model Repository Database section appears.
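Several values in the parameter tables follow a fixed shape: the Yarn resource manager URI and the HDFS NameNode URI use the `<Hostname>:<Port>` format, the ZooKeeper addresses form a comma-separated list, and the HDFS directory name depends on whether a service cluster name is set. The following Python sketch shows one way to check these values before you enter them. All function names are hypothetical, and the sketch assumes that each ZooKeeper entry also uses the `<Hostname>:<Port>` format, which the table does not state.

```python
from typing import List, Optional, Tuple


def parse_host_port(value: str) -> Tuple[str, int]:
    """Split a <Hostname>:<Port> value and check that the port is valid."""
    host, sep, port_text = value.strip().rpartition(":")
    if not sep or not host or not port_text.isdecimal():
        raise ValueError(f"expected <Hostname>:<Port>, got {value!r}")
    port = int(port_text)
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range in {value!r}")
    return host, port


def parse_zookeeper_addresses(value: str) -> List[Tuple[str, int]]:
    """Parse a comma-separated list of addresses.

    Assumption: each entry uses the <Hostname>:<Port> format.
    """
    return [parse_host_port(item) for item in value.split(",") if item.strip()]


def expected_hdfs_directory(
    domain_name: str,
    catalog_service_name: str,
    service_cluster_name: Optional[str] = None,
) -> str:
    """Return the HDFS directory that must exist for the Catalog Service.

    Without a service cluster name, the default
    <DomainName>_<CatalogServiceName> is used, as described in the table.
    """
    cluster = service_cluster_name or f"{domain_name}_{catalog_service_name}"
    return f"/Informatica/LDM/{cluster}"
```

With the made-up values `Domain` and `CS`, `expected_hdfs_directory("Domain", "CS")` returns `/Informatica/LDM/Domain_CS`, which is the directory to create in HDFS when no service cluster name is specified.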