The authentication process verifies the identity of a user account. With Big Data Management, user identities must be authenticated in the native Informatica domain and the non-native environment containing a Hadoop or Databricks cluster. Authentication for the Informatica domain is separate from authentication for the Hadoop cluster.
The Informatica domain can use native, LDAP, or Kerberos authentication. Native authentication stores user credentials and privileges in the domain configuration repository and performs all user authentication within the Informatica domain. LDAP authentication uses an LDAP directory service that stores user accounts and credentials that are accessed over the network.
Hadoop Authentication
By default, Hadoop does not authenticate users, so any user account can be specified in the Hadoop connection. Informatica recommends that you enable authentication for the cluster. If authentication is enabled for the cluster, the cluster authenticates the user account used for the Hadoop connection between Big Data Management and the cluster.
For a higher level of security, you can set up one of the following types of authentication for the cluster:
Kerberos authentication
Kerberos is a network authentication protocol that uses tickets to authenticate users and services in a network. Users are stored in the Kerberos principal database, and tickets are issued by a Key Distribution Center (KDC). User impersonation allows different users to run mappings on a Hadoop cluster that uses Kerberos authentication, or to connect to big data sources and targets that use Kerberos authentication.
Apache Knox Gateway
The Apache Knox Gateway is a REST API gateway that authenticates users and acts as a single access point for a Hadoop cluster.
For more information about how to enable authentication for the Hadoop cluster, see the documentation for your Hadoop distribution.
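As an illustration of the single access point that the Apache Knox Gateway provides, the following minimal sketch builds a WebHDFS request URL routed through a gateway and an HTTP Basic authentication header. The host name, port 8443, the "gateway" path, the "default" topology name, and the user credentials are assumptions chosen for the example, and the sketch assumes the gateway is configured for HTTP Basic authentication. The function names are hypothetical and are not part of any Informatica or Knox API.

```python
import base64
from urllib.parse import quote


def knox_webhdfs_url(host, port, topology, hdfs_path,
                     op="LISTSTATUS", gateway_path="gateway"):
    """Build a WebHDFS URL that goes through a Knox gateway.

    Assumed URL layout: https://host:port/<gateway_path>/<topology>/webhdfs/v1/<path>
    """
    path = quote(hdfs_path.lstrip("/"))
    return (f"https://{host}:{port}/{gateway_path}/{topology}"
            f"/webhdfs/v1/{path}?op={op}")


def basic_auth_header(user, password):
    """Return an HTTP Basic Authorization header for the given credentials."""
    raw = f"{user}:{password}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}


# Hypothetical values; replace them with the values for your gateway.
url = knox_webhdfs_url("knox.example.com", 8443, "default", "/tmp")
headers = basic_auth_header("alice", "secret")
```

Because every request goes to the gateway address, clients do not need the host names or ports of the individual cluster services.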
Databricks Authentication
The Data Integration Service uses token-based authentication to provide access to the Databricks environment. The Databricks administrator creates a token user and generates tokens for the user. The Databricks cluster configuration contains the token ID required for authentication.
If the token has an expiration date, obtain a new token from the Databricks administrator before the current token expires.
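The following minimal sketch shows what token-based access to Databricks can look like from a client, and one way to check whether a token is close to its expiration date. It illustrates the general pattern of sending a token as a bearer credential to the Databricks REST API, not how the Data Integration Service is implemented. The workspace URL, the token value, the seven-day warning window, and the function names are all assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone
import urllib.request


def build_databricks_request(workspace_url, token,
                             api_path="/api/2.0/clusters/list"):
    """Prepare (but do not send) a REST request that carries the token."""
    url = workspace_url.rstrip("/") + api_path
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"})


def needs_renewal(expires_at, now=None, warn_before=timedelta(days=7)):
    """Return True if the token expires within the warning window.

    expires_at is None for a token that has no expiration date.
    """
    if expires_at is None:
        return False
    now = now or datetime.now(timezone.utc)
    return expires_at - now <= warn_before


# Hypothetical values; the real token comes from the Databricks administrator.
request = build_databricks_request(
    "https://example.cloud.databricks.com", "dapi-EXAMPLE")
expires_at = datetime(2030, 1, 1, tzinfo=timezone.utc)
if needs_renewal(expires_at):
    print("Request a new token from the Databricks administrator.")
```

A check like this can run on a schedule so that the token in the cluster configuration is replaced before it expires.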