Installation and Configuration Guide

Installing by Creating a Domain

Create a domain if you are installing for the first time or if you want to administer nodes in separate domains.
  1. Log in to the machine with a system user account.
  2. Close all other applications.
  3. Run the ./install.sh command to start the installer.
    The installer displays a message asking you to read the Informatica documentation before you proceed with the installation.
  4. Press Y to continue the installation.
  5. Press 1 to install Informatica Big Data suite products.
  6. Press 1 to run the Pre-installation System Check tool. The tool verifies whether your machine meets the minimum system requirements to install or upgrade Informatica.
    You can skip this step if you are sure that your machine meets the minimum system requirements to install or upgrade Informatica.
  7. Press 3 to install Informatica.
  8. Press 2 to agree to the terms and conditions of the installation or upgrade.
  9. Press 2 to confirm that you understand that version 10.2.1 is specific to the Big Data suite of products and to continue with the installation.
  10. Press 2 to install Informatica application services with Enterprise Data Catalog.
    The installer prompts you to confirm that the current version of the Informatica application services is not installed on the node.
  11. Press 1 if you do not have the current version of the Informatica application services installed. Otherwise, press 2.
  12. Choose the Hadoop cluster type for Enterprise Data Catalog. Press 1 to deploy Enterprise Data Catalog on an existing Hadoop distribution. Press 2 to deploy Enterprise Data Catalog on an embedded Hadoop distribution.
    • If you chose the embedded Hadoop distribution, provide the following information after configuring the Informatica domain, the Model Repository Service, and the Data Integration Service:
    Option
    Description
    SSH username
    User name for the password-less Secure Shell (SSH) connection.
    Informatica Cluster service name
    Name of the Informatica Cluster Service for the internal cluster.
    Informatica Cluster service port
    Port number for the Informatica Cluster Service.
    Ambari server host
    Host information for the Ambari server. Ambari is a web-based tool for provisioning, managing, and monitoring Apache Hadoop clusters, which includes support for Hadoop HDFS, Hadoop MapReduce, Hive, HBase and ZooKeeper.
    Comma-separated Ambari agent hosts
    Applies to high availability. If you use multiple Ambari agent hosts, specify the host names as comma-separated values.
    Ambari web port
    Port number on which the Ambari server runs.
    Catalog service name
    Name of the catalog service.
    Catalog service port
    Port number of the catalog service.
    Keytab Location
    Applies to a Kerberos-enabled cluster. Location of the merged user and host keytab file.
    Kerberos configuration file
    Applies to a Kerberos-enabled cluster. Location of the Kerberos configuration file.
    • If you chose the external Hadoop distribution, specify if you need to have Kerberos authentication enabled for the cluster. Then, enter the following information:
    Option
    Description
    Catalog service name
    Name of the catalog service.
    Catalog service port
    Port number of the catalog service.
    Yarn resource manager URI
    The service within Hadoop that submits the MapReduce tasks to specific nodes in the cluster.
    Use the following format:
    <hostname>:<port>
    Where
    • hostname
      is the name or IP address of the Yarn resource manager.
    • port
      is the port on which Yarn resource manager listens for Remote Procedure Calls (RPC).
    Yarn resource manager http URI
    http URI value for the Yarn resource manager.
    Yarn resource manager scheduler URI
    Scheduler URI value for the Yarn resource manager.
    Zookeeper URI
    The URI for the ZooKeeper service, which is a high-performance coordination service for distributed applications.
    HDFS namenode URI
    The URI to access HDFS.
    Use the following format to specify the NameNode URI in the Cloudera distribution:
    hdfs://<namenode>:<port>
    Where
    • <namenode> is the host name or IP address of the NameNode.
    • <port> is the port on which the NameNode listens for Remote Procedure Calls (RPC).
    Service cluster name
    Name of the service cluster. Ensure that you have a directory
    /Informatica/LDM/<ServiceClusterName>
    in HDFS before the installation is complete.
    If you do not specify a service cluster name, Enterprise Data Catalog considers
    DomainName_CatalogServiceName
    as the default value. You must then have the
    /Informatica/LDM/<DomainName>_<CatalogServiceName>
    directory in HDFS. Otherwise, the Catalog Service might fail.
    History Server HTTP URI
    HTTP URI to access the history server.
    Is Cluster Secure?
    Set this property to one of the following values if you have an external cluster that is secure:
    • 1: specifies that the external cluster is not secure.
    • 2: specifies that the external cluster is secure.
    Default is 1.
    Is Cluster SSL Enabled?
    Set this property to one of the following values if you have an external cluster that is enabled for SSL:
    • 1: specifies that the external cluster is not enabled for SSL.
    • 2: specifies that the external cluster is enabled for SSL.
    Default is 1.
    Enable TLS for the Service?
    Set this property to one of the following values if you have an external cluster that is enabled for Transport Layer Security (TLS):
    • 1: specifies that the external cluster is not enabled for TLS.
    • 2: specifies that the external cluster is enabled for TLS.
    Default is 1.
    Is Cluster HA Enabled?
    Set this property to one of the following values if you have an external cluster that is enabled for high availability:
    • 1: specifies that the external cluster is not enabled for high availability.
    • 2: specifies that the external cluster is enabled for high availability.
    Default is 1.
    Depending on the settings that you specify, Enterprise Data Catalog creates an Informatica Cluster Service for the internal Hadoop distribution.
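The following is a minimal sketch of two checks you might run before you answer the external cluster prompts. It derives the HDFS directory that must exist for the service cluster name and checks that a value has the <hostname>:<port> shape. The function names service_cluster_dir and is_host_port, and the sample host names, are hypothetical and are not part of the installer.

```python
from typing import Optional


def service_cluster_dir(domain_name: str, catalog_service_name: str,
                        service_cluster_name: Optional[str] = None) -> str:
    """Return the HDFS directory expected for the service cluster."""
    # When no service cluster name is given, the documented default
    # is DomainName_CatalogServiceName.
    name = service_cluster_name or f"{domain_name}_{catalog_service_name}"
    return f"/Informatica/LDM/{name}"


def is_host_port(value: str) -> bool:
    """Check the <hostname>:<port> shape used for the Yarn resource manager URI."""
    host, sep, port = value.rpartition(":")
    if not host or sep != ":":
        return False
    # Accept only plain ASCII digits within the valid TCP port range.
    return port.isascii() and port.isdigit() and 0 < int(port) <= 65535


if __name__ == "__main__":
    print(service_cluster_dir("Domain_A", "CatalogSvc"))
    print(is_host_port("rm01.example.com:8032"))
```

The sketch only computes the expected path. It does not connect to HDFS, so you still need to confirm that the directory exists in the cluster.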
  13. Press 2 to confirm that you have read and accepted the terms and conditions for using the Java SE Development Kit software.
  14. Press Enter to continue.
  15. Press 2 if you want the installer to tune the Informatica application services based on the size of data that you want to deploy.
    The installer displays the following options for various data sizes:
    • Sandbox
    • Basic
    • Standard
    • High Concurrency and High Volume
  16. Type the path and file name of the Informatica license key and press Enter.
  17. Type the absolute path for the installation directory.
    The directory names in the path must not contain spaces or the following special characters: @ | * $ # ! % ( ) { } [ ] , ; '
    Default is /home/toolinst.
    Informatica recommends using alphanumeric characters in the installation directory path. If you use a special character such as á or €, unexpected results might occur at run time.
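As an illustration of the path rule above, the following sketch reports spaces and the listed special characters in a candidate installation directory. The function names are hypothetical, and the absolute path check assumes a POSIX-style path because the installer is started with ./install.sh.

```python
import posixpath

# Special characters that the guide disallows in directory names.
FORBIDDEN = set("@|*$#!%(){}[],;'")


def invalid_path_characters(path: str) -> list:
    """Return disallowed characters (including spaces) in order of first appearance."""
    found = []
    for ch in path:
        if (ch == " " or ch in FORBIDDEN) and ch not in found:
            found.append(ch)
    return found


def is_valid_install_dir(path: str) -> bool:
    """True if the path is absolute and contains no disallowed characters."""
    return posixpath.isabs(path) and not invalid_path_characters(path)


if __name__ == "__main__":
    for candidate in ("/home/toolinst", "/opt/my dir", "/opt/infa(test)"):
        print(candidate, is_valid_install_dir(candidate), invalid_path_characters(candidate))
```

The sketch does not flag non-alphanumeric characters such as á or €, which the guide discourages but does not forbid.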
  18. Press 2 to run the pre-validation utility. The utility helps you validate the prerequisites to install Enterprise Data Catalog in an embedded cluster. The utility also validates the Informatica domain, cluster hosts, and the Hadoop cluster services configuration.
    The installer prompts you to confirm whether you want to enable Kerberos authentication for the cluster.
  19. Press 2 if you want to enable Kerberos authentication for the cluster, and provide the following details:
    1. Keytab Location. Location of the merged user and host keytab file.
    2. Kerberos Configuration File. Location of the Kerberos configuration file.
  20. Type the gateway user name and press Enter. Default is root.
  21. Type the Informatica Hadoop cluster gateway hostname in the following format and press Enter:
    <hostname>.<FQDN>
  22. Type the comma-separated list of Informatica Hadoop cluster nodes in the following format and press Enter:
    <hostname>.<FQDN>, <hostname1>.<FQDN>, <hostname2>.<FQDN>
  23. Type the Informatica Hadoop cluster gateway port and press Enter. Default is 8080.
    To avoid a port conflict, make sure that you do not configure Oracle with port 8080 on the same machine where Informatica Cluster Service runs.
  24. Type the path to the working directory, and press Enter. The path indicates the location where you want to mount the Informatica Cluster Service.
    The installer starts the pre-validation utility.
  25. Press Enter to continue after running the pre-validation utility.
  26. Review the installation information, and press Enter to continue.
    The installer copies the Enterprise Data Catalog files to the installation directory. You see a prompt to create or join a domain.
  27. Press 1 to create a domain.
    When you create a domain, the node that you create becomes a gateway node in the domain. The gateway node contains a Service Manager that manages all domain operations.
  28. To enable secure communication for services in the domain, press 2. To disable secure communication for the domain, press 1.
    By default, if you enable secure communication for the domain, the installer sets up an HTTPS connection for the Informatica Administrator. You can also create a domain configuration repository on a secure database.
  29. Type the connection details for Informatica Administrator.
    1. If you do not enable secure communication for the domain, you can specify whether to set up a secure HTTPS connection for the Informatica Administrator.
      The following table describes the options available to enable or disable a secure connection to Informatica Administrator:
      Option
      Description
      1 - Enable HTTPS for Informatica Administrator
      Set up a secure connection to Informatica Administrator.
      2 - Disable HTTPS
      Do not set up a secure connection to Informatica Administrator.
    2. If you enable secure communication for the domain or if you enable HTTPS connection for the Informatica Administrator, enter the keystore file and port number for the HTTPS connection to Informatica Administrator.
      The following table describes the connection information you must enter if you enable HTTPS:
      Option
      Description
      Port
      Port number for the HTTPS connection.
      Keystore file
      Select whether to use a keystore file generated by the installer or a keystore file you create. You can use a keystore file with a self-signed certificate or a certificate signed by a certification authority.
      1 - Use a keystore generated by the installer
      2 - Specify a keystore file and password
      If you choose to use a keystore file generated by the installer, the installer creates a self-signed keystore file named Default.keystore in the following location:
      <Informatica installation directory>/tomcat/conf/
    3. If you specify the keystore, enter the password and location of the keystore file.
  30. Press 2 if you want to enable single sign-on using SAML authentication for Enterprise Data Catalog applications.
  31. Type the SAML Identity Provider (IdP) URL and press Enter.
    See the section Configure Single Sign-on with SAML Authentication for information about the configuration that you must complete after you install Enterprise Data Catalog.
    If you enabled secure communication for the domain, the Domain Security - Secure Communication section appears. If you did not enable secure communication for the domain, the Domain Configuration Repository section appears.
  32. In the Domain Security - Secure Communication section, specify whether to use the default Informatica SSL certificates or to use your SSL certificates to secure domain communication.
    1. Select the type of SSL certificates to use.
      The following table describes the options for the SSL certificates that you can use to secure the Informatica domain:
      Option
      Description
      1 - Use the default Informatica SSL certificate files
      Use the default SSL certificates provided by Informatica.
      If you do not provide an SSL certificate, Informatica uses the same default private key for all Informatica installations. If you use the default Informatica keystore and truststore files, the security of your domain could be compromised. To ensure a high level of security for the domain, select the option to specify the location of the SSL certificate files.
      2 - Specify the location of the SSL certificate files
      Use SSL certificates that you provide. You must specify the location of the keystore and truststore files.
      You can provide a self-signed certificate or a certificate issued by a certificate authority (CA). You must provide SSL certificates in PEM format and in Java Keystore (JKS) files. Informatica requires specific names for the SSL certificate files for the Informatica domain. You must use the same SSL certificates for all nodes in the domain. Store the truststore and keystore files in a directory accessible to all the nodes in the domain and specify the same keystore file directory and truststore file directory for all nodes in the same domain.
    2. If you provide the SSL certificate, specify the location and passwords of the keystore and truststore files.
      The following table describes the parameters that you must enter for the SSL certificate files:
      Property
      Description
      Keystore file directory
      Directory that contains the keystore files. The directory must contain files named infa_keystore.jks and infa_keystore.pem.
      Keystore password
      Password for the keystore infa_keystore.jks.
      Truststore file directory
      Directory that contains the truststore files. The directory must contain files named infa_truststore.jks and infa_truststore.pem.
      Truststore password
      Password for the infa_truststore.jks file.
    The Domain Configuration Repository section appears.
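Because the installer expects specific file names in the keystore and truststore directories, a quick check before you run this step can save a retry. The following sketch lists the required files that are missing from a directory. The function name missing_files and the sample directory are hypothetical.

```python
from pathlib import Path

# File names that the guide requires in each directory.
REQUIRED_KEYSTORE_FILES = ("infa_keystore.jks", "infa_keystore.pem")
REQUIRED_TRUSTSTORE_FILES = ("infa_truststore.jks", "infa_truststore.pem")


def missing_files(directory, required):
    """Return the required file names that are not present in the directory."""
    base = Path(directory)
    return [name for name in required if not (base / name).is_file()]


if __name__ == "__main__":
    # Hypothetical location; replace with your own certificate directory.
    cert_dir = "/opt/certs"
    print("Missing keystore files:", missing_files(cert_dir, REQUIRED_KEYSTORE_FILES))
    print("Missing truststore files:", missing_files(cert_dir, REQUIRED_TRUSTSTORE_FILES))
```

The check only confirms that the files exist. It does not verify the passwords or the certificate contents.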
  33. Select the database to use for the domain configuration repository.
    The following table lists the databases you can use for the domain configuration repository:
    Prompt
    Description
    Database type
    Type of database for the domain configuration repository. Select from the following options:
    1 - Oracle
    2 - Microsoft SQL Server
    3 - IBM DB2
    4 - Sybase ASE
    The Informatica domain configuration repository stores metadata for domain operations and user authentication. The domain configuration repository must be accessible to all gateway nodes in the domain.
  34. Enter the properties for the database user account.
    The following table lists the properties for the database user account:
    Property
    Description
    Database user ID
    Name for the domain configuration database user account.
    User password
    Password for the domain configuration database user account.
  35. Choose whether to create a secure domain configuration repository.
    You can create a domain configuration repository in a database secured with the SSL protocol. To create a domain configuration repository in a secure database, press 1.
    To create a domain configuration repository in an unsecure database, press 2.
  36. If you do not want to create a secure domain configuration repository, enter the parameters for the database.
    1. If you select IBM DB2, select whether to configure a tablespace and enter the tablespace name.
      The following table describes the properties that you must configure for the IBM DB2 database:
      Property
      Description
      Configure tablespace
      Select whether to specify a tablespace:
      1 - No
      2 - Yes
      In a single-partition database, if you select No, the installer creates the tables in the default tablespace. In a multi-partition database, you must select Yes.
      Tablespace
      Name of the tablespace in which to create the tables. Specify a tablespace that meets the pageSize requirement of 32768 bytes.
      In a single-partition database, if you select Yes to configure the tablespace, enter the name of the tablespace in which to create the tables.
      In a multi-partition database, specify the name of the tablespace that resides in the catalog partition of the database.
    2. If you select Microsoft SQL Server, enter the schema name for the database.
      The following table describes the properties that you must configure for the Microsoft SQL Server database:
      Property
      Description
      Schema name
      Name of the schema that will contain domain configuration tables. If this parameter is blank, the installer creates the tables in the default schema.
    3. To enter the JDBC connection information using the JDBC URL information, press 1. To enter the JDBC connection information using a custom JDBC connection string, press 2.
    4. Enter the JDBC connection information.
      • To enter the connection information using the JDBC URL information, specify the JDBC URL properties.
        The following table describes the database connection information:
        Prompt
        Description
        Database host name
        Host name for the database.
        Database port number
        Port number for the database.
        Database service name
        Service name for Oracle and IBM DB2 databases, or database name for Microsoft SQL Server and Sybase ASE.
        Configure JDBC Parameters
        Select whether to add additional JDBC parameters to the connection string:
        1 - Yes
        2 - No
        If you select Yes, enter the parameters or press Enter to accept the default. If you select No, the installer creates the JDBC connection string without parameters.
      • To enter the connection information using a custom JDBC connection string, type the connection string.
        Use the following syntax for the JDBC connection string for the databases:
        IBM DB2
        jdbc:Informatica:db2://host_name:port_no;DatabaseName=
        Oracle
        jdbc:Informatica:oracle://host_name:port_no;ServiceName=
        Microsoft SQL Server
        jdbc:Informatica:sqlserver://host_name:port_no;SelectMethod=cursor;DatabaseName=
        Sybase
        jdbc:Informatica:sybase://host_name:port_no;DatabaseName=
        Verify that the connection string contains all the connection parameters required by your database system.
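To show how the syntax above fits together, the following sketch fills in the host name, port number, and database or service name for each database type. The dictionary keys, the function name build_jdbc_url, and the sample values are assumptions for illustration only. The sketch assumes that the database or service name follows the trailing equals sign.

```python
# Connection string syntax from the guide, with placeholders for the values.
TEMPLATES = {
    "db2": "jdbc:Informatica:db2://{host}:{port};DatabaseName={name}",
    "oracle": "jdbc:Informatica:oracle://{host}:{port};ServiceName={name}",
    "sqlserver": "jdbc:Informatica:sqlserver://{host}:{port};SelectMethod=cursor;DatabaseName={name}",
    "sybase": "jdbc:Informatica:sybase://{host}:{port};DatabaseName={name}",
}


def build_jdbc_url(db_type: str, host: str, port: int, name: str) -> str:
    """Build a custom JDBC connection string for the given database type."""
    if db_type not in TEMPLATES:
        raise ValueError(f"Unsupported database type: {db_type}")
    return TEMPLATES[db_type].format(host=host, port=port, name=name)


if __name__ == "__main__":
    # Hypothetical host, port, and service name.
    print(build_jdbc_url("oracle", "dbhost.example.com", 1521, "orcl"))
```

If your database needs additional parameters, append them to the generated string as semicolon-separated name=value pairs.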
  37. If you create a secure domain configuration repository, enter the parameters for the secure database.
    If you create the domain configuration repository on a secure database, you must provide the truststore information for the database. You must also provide a JDBC connection string that includes the security parameters for the database.
    The following table describes the options available to create a secure domain configuration repository database:
    Property
    Description
    Database truststore file
    Path and file name of the truststore file for the secure database.
    Database truststore password
    Password for the truststore file.
    Custom JDBC Connection String
    Complete JDBC connection string for the secure database, including the host name and port number and the secure database parameters.
    In addition to the host name and port number for the database server, you must include the following secure database parameters:
    EncryptionMethod
    Required. Indicates whether data is encrypted when transmitted over the network. This parameter must be set to SSL.
    ValidateServerCertificate
    Optional. Indicates whether Informatica validates the certificate that the database server sends.
    If this parameter is set to True, Informatica validates the certificate that the database server sends. If you specify the HostNameInCertificate parameter, Informatica also validates the host name in the certificate.
    If this parameter is set to False, Informatica does not validate the certificate that the database server sends. Informatica ignores any truststore information that you specify.
    Default is True.
    HostNameInCertificate
    Optional. Host name of the machine that hosts the secure database. If you specify a host name, Informatica validates the host name included in the connection string against the host name in the SSL certificate.
    cryptoProtocolVersion
    Required. Specifies the cryptographic protocol to use to connect to a secure database. You can set the parameter to cryptoProtocolVersion=TLSv1.1 or cryptoProtocolVersion=TLSv1.2 based on the cryptographic protocol used by the database server.
    You can use the following syntax for the connection strings:
    • Oracle:
      jdbc:Informatica:oracle://host_name:port_no;ServiceName=service_name;EncryptionMethod=SSL;HostNameInCertificate=DB_host_name;ValidateServerCertificate=true_or_false
    • IBM DB2:
      jdbc:Informatica:db2://host_name:port_no;DatabaseName=database_name;EncryptionMethod=SSL;HostNameInCertificate=DB_host_name;ValidateServerCertificate=true_or_false
    • Microsoft SQL Server:
      jdbc:Informatica:sqlserver://host_name:port_no;SelectMethod=cursor;DatabaseName=database_name;EncryptionMethod=SSL;HostNameInCertificate=DB_host_name;ValidateServerCertificate=true_or_false
    The installer does not validate the connection string. Verify that the connection string contains all the connection parameters and security parameters required by your database.
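Because the installer does not validate the string, a small check of your own can catch a missing required parameter. The following sketch parses the semicolon-separated parameters and reports problems with EncryptionMethod and cryptoProtocolVersion. The function names are hypothetical, and matching the parameter names without regard to case is an assumption of this sketch, not documented driver behavior.

```python
def parse_jdbc_params(conn: str) -> dict:
    """Return the name=value pairs that follow the first semicolon."""
    _, _, tail = conn.partition(";")
    params = {}
    for pair in tail.split(";"):
        key, sep, value = pair.partition("=")
        if sep:
            # Assumption: parameter names are compared case-insensitively.
            params[key.strip().lower()] = value.strip()
    return params


def secure_param_problems(conn: str) -> list:
    """List problems with the required secure database parameters."""
    params = parse_jdbc_params(conn)
    problems = []
    if params.get("encryptionmethod") != "SSL":
        problems.append("EncryptionMethod must be set to SSL")
    if params.get("cryptoprotocolversion") not in ("TLSv1.1", "TLSv1.2"):
        problems.append("cryptoProtocolVersion must be TLSv1.1 or TLSv1.2")
    return problems


if __name__ == "__main__":
    # Hypothetical connection string.
    conn = ("jdbc:Informatica:oracle://dbhost:1521;ServiceName=orcl;"
            "EncryptionMethod=SSL;ValidateServerCertificate=true;"
            "cryptoProtocolVersion=TLSv1.2")
    print(secure_param_problems(conn))
```

An empty list means that both required parameters are present with accepted values. The sketch does not test whether the database server accepts the connection.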
  38. If the database contains a domain configuration repository for a previous domain, choose to overwrite the data or set up another database.
    The following table describes the options for overwriting the data or setting up another database when the database already contains a domain configuration repository for a previous domain:
    Option
    Description
    1 - OK
    Enter the connection information for a new database.
    2 - Continue
    The installer overwrites the data in the database with new domain configuration.
  39. In the Domain Security - Encryption Key section, enter the keyword and encryption key directory for the Informatica domain.
    The following table describes the encryption key parameters that you must specify:
    Property
    Description
    Keyword
    Keyword to use to create a custom encryption key to secure sensitive data in the domain. The keyword must meet the following criteria:
    • From 8 to 20 characters long
    • Includes at least one uppercase letter
    • Includes at least one lowercase letter
    • Includes at least one number
    • Does not contain spaces
    The encryption key is created based on the keyword that you provide when you create the Informatica domain.
    Encryption key directory
    Directory in which to store the encryption key for the domain. The default location is the following directory:
    <Informatica installation directory>/isp/config/keys
    The installer sets different permissions on the directory and the files in the directory.
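The keyword criteria above translate directly into a short check. The following sketch returns the list of criteria that a candidate keyword fails. The function name keyword_problems and the sample keyword are hypothetical.

```python
def keyword_problems(keyword: str) -> list:
    """Return the encryption keyword criteria that the keyword does not meet."""
    problems = []
    if not 8 <= len(keyword) <= 20:
        problems.append("must be 8 to 20 characters long")
    if not any(c.isupper() for c in keyword):
        problems.append("must include at least one uppercase letter")
    if not any(c.islower() for c in keyword):
        problems.append("must include at least one lowercase letter")
    if not any(c.isdigit() for c in keyword):
        problems.append("must include at least one number")
    if " " in keyword:
        problems.append("must not contain spaces")
    return problems


if __name__ == "__main__":
    # Hypothetical keyword used only to demonstrate the check.
    print(keyword_problems("Inf0rmatica"))
```

An empty list means that the keyword meets all of the listed criteria.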
  40. Press Enter.
    The Domain and Node Configuration section appears.
  41. Enter the information for the domain and the node that you want to create.
    The following table describes the properties that you set for the domain and gateway node:
    Property
    Description
    Domain name
    Name of the domain to create. The default domain name is Domain_<MachineName>. The name must not exceed 128 characters and must be 7-bit ASCII only. It cannot contain a space or any of the following characters: ` % * + ; " ? , < > \ /
    Node host name
    Host name of the machine on which to create the node. The node host name cannot contain the underscore (_) character. If the machine has a single network name, use the default host name. If the machine has multiple network names, you can modify the default host name to use an alternate network name. Optionally, you can use the IP address.
    Do not use localhost. The host name must explicitly identify the machine.
    Node name
    Name of the node to create on this machine. The node name is not the host name for the machine.
    Node port number
    Port number for the node. The default port number for the node is 6005. If the port number is not available on the machine, the installer displays the next available port number.
    Domain user name
    User name for the domain administrator. You can use this user name to initially log in to Informatica Administrator. Use the following guidelines:
    • The name is not case sensitive and cannot exceed 128 characters.
    • The name cannot include a tab, newline character, or the following special characters: % * + / ? ; < >
    • The name can include an ASCII space character except for the first and last character. Other space characters are not allowed.
    Domain password
    Password for the domain administrator. The password must be more than 2 characters and must not exceed 16 characters.
    Confirm password
    Enter the password again to confirm.
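The naming rules in the table above can be checked before you type the values at the prompts. The following sketch covers the domain name, node host name, and domain password rules. The function names and sample values are hypothetical.

```python
# Characters that the guide disallows in a domain name, plus the space character.
DOMAIN_FORBIDDEN = set('`%*+;"?,<>\\/ ')


def is_valid_domain_name(name: str) -> bool:
    """At most 128 characters, 7-bit ASCII only, no disallowed characters."""
    return (0 < len(name) <= 128
            and name.isascii()
            and not any(c in DOMAIN_FORBIDDEN for c in name))


def is_valid_node_host(host: str) -> bool:
    """No underscore, and not localhost."""
    return bool(host) and "_" not in host and host.lower() != "localhost"


def is_valid_domain_password(password: str) -> bool:
    """More than 2 characters and no more than 16 characters."""
    return 2 < len(password) <= 16


if __name__ == "__main__":
    print(is_valid_domain_name("Domain_node01"))
    print(is_valid_node_host("node01.example.com"))
    print(is_valid_domain_password("abc"))
```

The sketch does not check the domain user name rules or whether the host name resolves on your network.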
  42. Select whether to display the default ports for the domain and node components assigned by the installer.
    The following table describes the advanced port configuration page:
    Prompt
    Description
    Display advanced port configuration page
    Select whether to display the port numbers for the domain and node components assigned by the installer:
    1 - No
    2 - Yes
    If you select Yes, the installer displays the default port numbers assigned to the domain components. You can specify the port numbers to use for the domain and node components. You can also specify a range of port numbers to use for the service processes that will run on the node. You can use the default port numbers or specify new port numbers. Verify that the port numbers you enter are not used by other applications.
  43. If you display the port configuration page, enter new port numbers at the prompt or press Enter to use the default port numbers.
    Port
    Description
    Service Manager port
    Port number used by the Service Manager on the node. The Service Manager listens for incoming connection requests on this port. Client applications use this port to communicate with the services in the domain. The Informatica command line programs use this port to communicate with the domain. This is also the port for the SQL data service JDBC/ODBC driver. Default is 6006.
    Service Manager Shutdown port
    Port number that controls server shutdown for the domain Service Manager. The Service Manager listens for shutdown commands on this port. Default is 6007.
    Informatica Administrator port
    Port number used by Informatica Administrator. Default is 6008.
    Informatica Administrator HTTPS port
    No default port. Enter the required port number when you create the service. Setting this port to 0 disables an HTTPS connection to the Administrator tool.
    Informatica Administrator shutdown port
    Port number that controls server shutdown for Informatica Administrator. Informatica Administrator listens for shutdown commands on this port. Default is 6009.
    Minimum port number
    Lowest port number in the range of dynamic port numbers that can be assigned to the application service processes that run on this node. Default is 6014.
    Maximum port number
    Highest port number in the range of dynamic port numbers that can be assigned to the application service processes that run on this node. Default is 6114.
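One way to verify that the default ports are not used by other applications is to try to bind each port on the machine where you run the installer. The following sketch does this with the standard socket module. The DEFAULT_PORTS mapping uses the default values listed in this procedure, and the function names are hypothetical.

```python
import socket

# Default port numbers listed in this procedure.
DEFAULT_PORTS = {
    "Node port": 6005,
    "Service Manager port": 6006,
    "Service Manager Shutdown port": 6007,
    "Informatica Administrator port": 6008,
    "Informatica Administrator shutdown port": 6009,
}


def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a TCP socket can be bound to the port on this machine."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
        return True


def ports_in_use(ports: dict, host: str = "0.0.0.0") -> list:
    """Return the names of the ports that cannot be bound."""
    return [name for name, port in ports.items() if not port_is_free(port, host)]


if __name__ == "__main__":
    print("Ports already in use:", ports_in_use(DEFAULT_PORTS))
```

A successful bind shows only that the port is free at the moment of the check. Another application can still claim the port before the installer starts the services.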
  44. Choose whether you want to create Model Repository Service, Data Integration Service, and Catalog Service as part of the installation. You can create these services after installation in Informatica Administrator. Press 1 to create the services, or press 2 to complete the installation without creating the services.
    If you pressed 1, the Model Repository Service Database section appears.
  45. If you pressed 1, choose the database type, and enter the database parameters for the Model repository.
  46. Choose whether you want to configure a secure database. Press 1 to configure a secure database, or press 2 to skip the step.
  47. To configure JDBC connection information, press 1 and enter the JDBC parameters. Press 2 to skip configuring the JDBC connection.
  48. Choose the database type for the Model repository, and enter the credentials including the database user ID and user password.
  49. Optionally, configure the JDBC connection and its parameters.
  50. Enter the following information: Model Repository Service name, Data Integration Service name, and the port number for the Data Integration Service if you do not want to use the default value.
    Option
    Description
    MRS name
    Name of the Model Repository Service.
    DIS name
    Name of the Data Integration Service.
    HTTP protocol type
    Security protocol that the Data Integration Service uses.
    Port
    Port number.
    You see messages about creating Model Repository Service and Data Integration Service.
  51. Enter the following required information in addition to the Model Repository Service and Data Integration Service to create the profiling warehouse and reference data warehouse databases:
    Reference data warehouse database type
    Database type for the reference data warehouse. The reference data warehouse supports IBM DB2 UDB, Microsoft SQL Server, or Oracle.
    Reference data warehouse database host name
    The name of the machine hosting the reference data warehouse.
    Profiling warehouse database type
    Database type for the profiling warehouse. The profiling warehouse supports IBM DB2 UDB, Microsoft SQL Server, or Oracle.
    Profiling warehouse database host name
    The name of the machine hosting the profiling warehouse.
The Post-installation Summary indicates whether the installation completed successfully. You can view the installation log files to get more information about the tasks performed by the installer and to view configuration properties for the installed components.