Installation and Configuration Guide

Prerequisites for the Embedded Cluster

Before you install Enterprise Data Catalog on an embedded Hadoop cluster, you must verify that the system environment meets the prerequisites required to deploy Enterprise Data Catalog.
Verify that the internal Hadoop distribution meets the following prerequisites:
  • Operating system is 64-bit Red Hat Enterprise Linux version 6.5 or later.
    For Red Hat Enterprise Linux version 7.0, make sure that you are using the following versions of snappy-devel and Sudo:
    • snappy-devel-1.0.5-1.el6.x86_64 on all Apache Ambari hosts.
    • Sudo 1.8.16
  • Verify that you disable SSL certificate validation if you are using Red Hat Enterprise Linux.
  • Verify that the cluster nodes meet the following requirements:
    Node Type      Minimum Requirements
    Master node    4 CPUs; 16 GB of unused memory; 60 GB of disk space
    Slave node     4 CPUs; 16 GB of unused memory; 60 GB of disk space
  • If the cluster is enabled for SSL, ensure that you import the Ambari Server certificate to the Informatica domain truststore.
  • Verify that the root directory (/) has a minimum of 10 GB of free disk space.
  • If you want to mount Informatica Cluster Service on a separate mount location, verify that the mount location has a minimum of 50 GB of free disk space.
  • Verify that the Linux repository includes postgresql version 8.4.18, release 1.el6_4. If it does not, install that version and release of postgresql.
  • Make sure that you merge the user and host keytab files before you enable Kerberos authentication for Informatica Cluster Service.
  • Verify that you install the following prerequisite packages before you enable Kerberos:
    • krb5-workstation
    • krb5-libs
    • krb5-auth-dialog
  • Make sure that the NOEXEC flag is not set for the file system mounted on the /tmp directory.
  • Ensure that the Linux base repositories are configured.
  • Verify that you have write permission on the /home directory.
  • On each host machine, verify that you have the following tools and applications available:
    • YUM and RPM (RHEL/CentOS/Oracle Linux)
    • Zypper and php_curl (SLES)
    • apt (Ubuntu)
    • scp, curl, unzip, tar, and wget
    • awk
    • OpenSSL version 1.0.1e-30.el6_6.5.x86_64 or later. Make sure that you do not use versions in the 1.0.2 branch.
      Make sure that the $PATH variable points to the /usr/bin directory to use the correct version of Linux OpenSSL.
    • Verify that the secure path in the /etc/sudoers file has the /usr/bin directory location at the start.
    • Python version 2.6.x for Red Hat Enterprise Linux version 6.5.
      If you install SUSE Linux Enterprise 11, update all the hosts to Python version 2.6.8-0.15.1.
    • Python version 2.7.x for Red Hat Enterprise Linux version 7.0.
    • If you install on SUSE Linux Enterprise 12, make sure that you install the following RPMs on all the cluster nodes:
      • openssl-1.0.1c-2.1.3.x86_64.rpm
      • libopenssl1_0_0-1.0.1c-2.1.3.x86_64.rpm
      • libopenssl1_0_0-32bit-1.0.1c-2.1.3.x86_64.rpm
      • python-devel-2.6.8-0.15.1.x86_64
    • If you have not configured the Linux base repository or if you do not have an Internet connection, install the following packages:
      • Version 8.4 of the following RPMs on the Ambari Server host:
        • postgresql-libs
        • postgresql-server
        • postgresql
      • The following RPMs on all cluster nodes:
        • nc
        • redhat-lsb
        • psmisc
        • python-devel-2.7.5-34.el7.x86_64
    • If you do not have an Internet connection, make sure that you have installed Java Development Kit (JDK) version 1.8. Configure the JAVA_HOME environment variable to point to the JDK installation.
    • If you have an Internet connection and any version of JDK installed, uninstall the JDK.
      Enterprise Data Catalog installs JDK version 1.8 and PostgreSQL version 8.4 as part of Apache Ambari installation. The location of the JDK package is /var/lib/ambari-server/resources/jdk-8u60-linux-x64.tar.gz.
  • Ensure that you install JDK 1.8 on all cluster nodes.
  • Apache Ambari requires certain ports to be open and available during the installation so that it can communicate with the hosts that it deploys and manages. To meet this requirement, temporarily disable iptables.
  • Verify that you meet the memory and package requirements for Apache Ambari. For more information, see the Hortonworks documentation.
  • Make sure that each machine in the cluster includes the 127.0.0.1 localhost localhost.localdomain entry in the /etc/hosts file.
  • Verify that the /etc/hosts file includes the fully qualified host names for all the cluster nodes. Alternatively, make sure that reverse DNS lookup returns the fully qualified host names for all the cluster nodes.
  • Before you deploy Enterprise Data Catalog on clusters where Apache Ranger is enabled, make sure that you configure the following permissions for the Informatica domain user:
    • Write permission on the HDFS folder.
    • Permission to submit applications to the YARN queue.
  • If the cluster is enabled for SSL, it is recommended that you also enable SSL for the Informatica domain, the Informatica Cluster Service, and the Catalog Service.
  • If you want to enable Kerberos authentication for Enterprise Data Catalog deployed on a multi-node Informatica domain, make sure that you complete the following prerequisites:
    • Make sure that all the domain nodes include the krb5.conf file in the following directories:
      • $INFA_HOME/services/shared/security/
      • /etc/
    • Make sure that the /etc/hosts file on every cluster node and domain node includes the krb hosts entry and a host entry for each of the other nodes.
    • Install krb5-workstation on all domain nodes.
    • Make sure that the keytab file is present in a common location on all domain nodes.
  • If you want to enable SSL authentication for Enterprise Data Catalog deployed on a multi-node Informatica domain, make sure that you complete the following prerequisites:
    • Export the Default.keystore of each node to the infa_truststore.jks on all nodes.
    • Make sure that the Default.keystore is unique for each host node.
    • Copy the Default.keystore to a unique location on each node.
    • If Informatica Cluster Service and Catalog Service are on different nodes, then export the Apache Ambari server certificate to the infa_truststore.jks on all nodes.
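
Several of the host-level prerequisites above lend themselves to a scripted check. The following Python sketch is illustrative only and is not part of the Informatica installer. All function names (missing_tools, free_gib, is_noexec, has_localhost_entry, report) are hypothetical. It assumes a Linux host that exposes /proc/mounts, and it covers only the tool list, the 10 GB requirement for the root directory, the NOEXEC flag on /tmp, and the localhost entry in /etc/hosts.

```python
import shutil

# Tools named in the prerequisites; "openssl" is checked for presence only,
# not for its version.
REQUIRED_TOOLS = ["scp", "curl", "unzip", "tar", "wget", "awk", "openssl"]
GIB = 1024 ** 3


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools that cannot be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def free_gib(path):
    """Return the free space, in GiB, of the file system that holds path."""
    return shutil.disk_usage(path).free / GIB


def is_noexec(mounts_text, mount_point="/tmp"):
    """Return True if mount_point is mounted with the noexec option.

    mounts_text is the content of /proc/mounts. If mount_point has no
    entry of its own (it lives on a parent file system), return False.
    """
    result = False
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            # A later entry for the same mount point shadows earlier ones.
            result = "noexec" in fields[3].split(",")
    return result


def has_localhost_entry(hosts_text):
    """Return True if a 127.0.0.1 line names both required host names."""
    required = {"localhost", "localhost.localdomain"}
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop trailing comments
        if fields and fields[0] == "127.0.0.1" and required <= set(fields[1:]):
            return True
    return False


def report():
    """Print a short summary of the checks for the local host."""
    with open("/proc/mounts") as handle:
        mounts = handle.read()
    with open("/etc/hosts") as handle:
        hosts = handle.read()
    print("missing tools:", missing_tools() or "none")
    print("free space on / in GiB: %.1f (10 required)" % free_gib("/"))
    print("/tmp mounted noexec:", is_noexec(mounts))
    print("localhost entry present:", has_localhost_entry(hosts))
```

Calling report() on each cluster node prints one line per check. The parsing functions take file contents as arguments rather than reading the files themselves, so they can be exercised with sample text before being run against a real host. The remaining prerequisites, such as package versions, Kerberos, and SSL, still need to be verified manually.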