You can install Enterprise Data Catalog on an existing cluster that uses Kerberos network authentication to authenticate users and services on a network. Enterprise Data Catalog also supports SSL authentication for secure communication in the cluster.
Kerberos is a network authentication protocol that uses tickets to authenticate access to services and nodes in a network. Kerberos uses a Key Distribution Center (KDC) to validate the identities of users and services and to grant tickets to authenticated user and service accounts. In the Kerberos protocol, users and services are known as principals. The KDC maintains a database of principals and the secret keys associated with them, which serve as proof of identity. Kerberos can use an LDAP directory service as the principal database.
Informatica does not support cross-realm or multi-realm Kerberos authentication. The server host, client machines, and Kerberos authentication server must be in the same realm.
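The single-realm requirement can be checked before any principals are created. The following is a minimal sketch, not an Informatica utility: the function names `realm_of` and `single_realm` are hypothetical, and the sketch assumes principals are written in the usual `name@REALM` or `name/host@REALM` form.

```python
def realm_of(principal: str) -> str:
    """Return the realm part of a principal such as 'svc/host@REALM'."""
    name, sep, realm = principal.rpartition("@")
    if not sep or not name or not realm:
        raise ValueError(f"principal has no realm: {principal!r}")
    return realm


def single_realm(principals):
    """Return the realm shared by all principals, or raise if there is not exactly one."""
    realms = {realm_of(p) for p in principals}
    if len(realms) != 1:
        raise ValueError(f"expected exactly one realm, found: {sorted(realms)}")
    return realms.pop()
```

Calling `single_realm` with every user, service, and host principal planned for the deployment raises an error as soon as one of them belongs to a different realm.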
The Informatica domain requires keytab files to authenticate nodes and services in the domain without transmitting passwords over the network. The keytab files contain the service principal names (SPNs) and their associated encrypted keys. Create the keytab files before you create nodes and services in the Informatica domain.
Prerequisites for SSL Authentication
Verify that the existing cluster meets the following requirements before you enable SSL authentication in the cluster:
The Informatica domain is configured in SSL mode.
The cluster and YARN REST endpoints are Kerberos-enabled.
Create a keystore file for the Apache Solr application on all nodes in the cluster. Import the public certificates from the Apache Solr keystore files on all hosts into every truststore file configured for HDFS and YARN. This step is required for Apache Spark and scanner jobs to connect to the Apache Solr application.
Import the public certificates of the Apache Solr and YARN applications into the truststore file of the Informatica domain. This step is required for the Catalog Service to connect to the YARN and Solr applications.
Import the public certificates of the Informatica domain and the Catalog Service into the YARN truststore.
Import the public certificate of the Catalog Service into the Informatica domain truststore.
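The certificate imports listed above can be written down as a plan before running anything. The following is a minimal sketch that assumes the truststores are JKS files managed with the JDK `keytool` utility, which the text above does not state. The short labels (`solr`, `domain`, and so on), the function names, and all file paths are hypothetical. The sketch also simplifies to one certificate per source, whereas the Solr requirement applies to the certificate from every host.

```python
# (certificate source, destination truststore) pairs taken from the requirements above.
# The labels are hypothetical names used only in this sketch.
IMPORTS = [
    ("solr", "hdfs"),
    ("solr", "yarn"),
    ("solr", "domain"),
    ("yarn", "domain"),
    ("domain", "yarn"),
    ("catalog_service", "yarn"),
    ("catalog_service", "domain"),
]


def import_cert_cmd(cert_file, alias, truststore, storepass):
    """Build a keytool command that imports one certificate into one truststore."""
    return [
        "keytool", "-importcert", "-noprompt",
        "-alias", alias,
        "-file", cert_file,
        "-keystore", truststore,
        "-storepass", storepass,
    ]


def import_plan(cert_files, truststores, storepass):
    """Return one keytool command per (source, destination) pair in IMPORTS."""
    return [
        import_cert_cmd(cert_files[src], src, truststores[dst], storepass)
        for src, dst in IMPORTS
    ]
```

Each command in the returned list can be passed to `subprocess.run(cmd, check=True)` on the host that holds the destination truststore. The sketch uses a single truststore password for brevity; real truststores usually have separate passwords.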
If you plan to deploy Enterprise Data Catalog on an existing Hortonworks version 2.5 cluster that does not support SSL authentication, perform the following steps:
Configure the following properties in the /etc/hadoop/conf/ssl-client.xml file: ssl.client.truststore.location and ssl.client.truststore.password.
Ensure that the ssl.client.truststore.location value points to a location under the /opt directory and not under the /etc directory. Specify the full path to the truststore file for the ssl.client.truststore.location property. For example, /opt/truststore/infa_truststore.jks.
Export the keystore certificate used in the Informatica domain.
Import the keystore certificate into the Informatica domain truststore file.
Place the domain truststore file in the /opt directory on all the Hadoop nodes. For example, /opt/truststore/infa_truststore.jks.
Open the /etc/hadoop/conf/ssl-client.xml file.
Modify the ssl.client.truststore.location and ssl.client.truststore.password properties.
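The ssl-client.xml changes described above can be sketched as a small script. This is an illustration under stated assumptions, not an Informatica tool: it assumes the file uses the standard Hadoop `<configuration>` / `<property>` / `<name>` / `<value>` layout, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from pathlib import PurePosixPath


def check_truststore_path(location):
    """Reject values that are not a full path to something under /opt."""
    path = PurePosixPath(location)
    if not path.is_absolute():
        raise ValueError(f"not a full path: {location!r}")
    if path.parts[:2] != ("/", "opt") or len(path.parts) < 3:
        raise ValueError(f"truststore must be a file under /opt: {location!r}")


def set_property(root, name, value):
    """Update the named property, or append it if it is missing."""
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            value_el = prop.find("value")
            if value_el is None:
                value_el = ET.SubElement(prop, "value")
            value_el.text = value
            return
    prop = ET.SubElement(root, "property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = value


def update_ssl_client(xml_text, location, password):
    """Return the ssl-client.xml text with both truststore properties set."""
    check_truststore_path(location)
    root = ET.fromstring(xml_text)
    set_property(root, "ssl.client.truststore.location", location)
    set_property(root, "ssl.client.truststore.password", password)
    return ET.tostring(root, encoding="unicode")
```

The path check covers only the two rules stated above (a full path, located under /opt); it cannot confirm that the file exists on every Hadoop node. Note that ElementTree does not preserve comments from the original file, so keep a backup of ssl-client.xml before writing the result.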
Prerequisites for Kerberos Authentication
Perform the following steps before you enable Kerberos authentication for the existing cluster:
Create the following users in the LDAP security domain, where <user name> is the service cluster name:
<user name>@KERBEROSDOMAIN.COM
<user name>/<host name>@KERBEROSDOMAIN.COM. Create this user ID for each host in the cluster.
HTTP/<host name>@KERBEROSDOMAIN.COM. Create this user ID for each host in the cluster.
Create a keytab file that contains the credentials for all the users that you created in LDAP. You can create a separate keytab file for each user on the KDC server and then merge the files with the ktutil command to create a single keytab file.
Create the following folders in HDFS that Enterprise Data Catalog uses as data directories for the Catalog Service: /Informatica/LDM/<user name> and /user/<user name>.
Change the owner of these two folders to <user name>.
Create a local user named <user name> on all the hosts in the cluster. This step is required to launch the application on YARN as the user configured for the Catalog Service.
Set the udp_preference_limit parameter in the krb5.conf Kerberos configuration file to 1. This parameter determines the protocol that Kerberos uses when it sends a message to the KDC. Set udp_preference_limit = 1 to always use TCP. The Informatica domain supports only the TCP protocol. If the udp_preference_limit parameter is set to any other value, the Informatica domain might shut down unexpectedly.
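Two of the steps above lend themselves to a quick pre-flight script: listing the principals to create and confirming the udp_preference_limit setting. The following is a minimal sketch with hypothetical function names. It assumes that udp_preference_limit is set in the [libdefaults] section of krb5.conf and that comment lines start with `#` or `;`. Included files and directories are not followed.

```python
import re


def principals_for(user, hosts, realm="KERBEROSDOMAIN.COM"):
    """List the user, per-host user, and per-host HTTP principals to create."""
    names = [f"{user}@{realm}"]
    for host in hosts:
        names.append(f"{user}/{host}@{realm}")
        names.append(f"HTTP/{host}@{realm}")
    return names


def udp_preference_limit(krb5_text):
    """Return udp_preference_limit from [libdefaults] as an int, or None if unset."""
    section = None
    for raw in krb5_text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":
            continue  # skip blank lines and comment lines
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip()
            continue
        if section == "libdefaults":
            match = re.fullmatch(r"udp_preference_limit\s*=\s*(\d+)", line)
            if match:
                return int(match.group(1))
    return None


def uses_tcp_only(krb5_text):
    """True only when udp_preference_limit is explicitly set to 1."""
    return udp_preference_limit(krb5_text) == 1
```

For example, `principals_for("edc", ["node1.example.com"])` returns the three principals for a one-node cluster whose service cluster name is `edc`. Both the cluster name and the host name here are placeholders. Running `uses_tcp_only` against the contents of krb5.conf on each host flags any host where the parameter is missing or set to another value.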
Enterprise Data Catalog does not support deployment on a Hortonworks version 2.6 cluster where Kerberos is enabled.