Installation and Configuration Guide

Configuring MDM Big Data Relationship Management to Use Kerberos Authentication

If your environment uses Kerberos authentication, you can configure the batch jobs to authenticate through Kerberos when they access data in HBase and Hive. Authentication prevents unauthorized access to the data.
Before you begin, ensure that you have an existing Kerberos authentication infrastructure.
  1. Open the configuration file in a text editor.
  2. To configure Kerberos authentication for HBase, add the following parameters to the HBASEConfiguration section:
    KeyTabFile
    Optional. Absolute path and file name of the keytab file. A keytab file contains a list of keys that are analogous to user passwords. Applicable if you use Kerberos for authentication.
    If you use the KeyTabFile parameter, ensure that the name of the file and the absolute path to the file are the same for all the nodes in a distributed Hadoop cluster.
    PrincipalName
    Required if you use Kerberos for authentication. Service Principal Name (SPN) of the HBase master server. For example, hbase/_Host@realm.com.
    You can get the SPN of the HBase master server from the hbase.master.kerberos.principal property in the following file: ${HBASE_HOME}/conf/hbase-site.xml
    The following sample HBASEConfiguration section includes the parameters related to Kerberos authentication for HBase:
    <HBASEConfiguration>
       <HbaseMaster>HadoopServer:60000</HbaseMaster>
       <HbaseZookeeperClientPort>2181</HbaseZookeeperClientPort>
       <HbaseZookeeperQuorum>iir-hadoop-test1</HbaseZookeeperQuorum>
       <HbaseRootDirectory />
       <HbaseDistributed>true</HbaseDistributed>
       <HbaseZookeeperZnodeParent />
       <HbaseCompressionAlgorithm>SNAPPY</HbaseCompressionAlgorithm>
       <HbaseDataBlockEncoding>PREFIX</HbaseDataBlockEncoding>
       <ScanCacheSize>100000</ScanCacheSize>
       <CacheBlock>false</CacheBlock>
       <AutoFlush>false</AutoFlush>
       <WALonPUT>false</WALonPUT>
       <ScanBatchSize>100</ScanBatchSize>
       <EnableSmallScan>false</EnableSmallScan>
       <RegionSplitSize>8</RegionSplitSize>
       <DriverName>com.informatica.mdmbde.database.hbase.HBaseDatabaseAdapterImpl</DriverName>
       <SearchTokenValidity>1000</SearchTokenValidity>
       <KeyTabFile>/etc/security/keytabs/hbase.keytab</KeyTabFile>
       <PrincipalName>hbase/_Host@realm.com</PrincipalName>
    </HBASEConfiguration>
  3. To configure Kerberos authentication for Hive, add the following parameters to the HiveConfiguration section:
    KeyTabFile
    Optional. Absolute path and file name of the keytab file. A keytab file contains a list of keys that are analogous to user passwords. Applicable if you use Kerberos for authentication.
    If you use the KeyTabFile parameter, ensure that the name of the file and the absolute path to the file are the same for all the nodes in a distributed Hadoop cluster.
    PrincipalName
    Required if you use Kerberos for authentication. Service Principal Name (SPN) of the Hive master server. For example, hive/_Host@realm.com.
    You can get the SPN of the Hive master server from the hive.metastore.kerberos.principal property in the following file: ${HIVE_HOME}/conf/hive-site.xml
    JDBCUrl
    JDBC connection URL to access metadata from Hive.
    Use the following format for the JDBC connection URL:
    jdbc:hive2://<Host Name>:<Port>/default
    Host Name is the name of the machine that hosts the Hive master server, and Port is the port number on which the Hive master server listens.
    The following sample HiveConfiguration section includes the parameters related to Kerberos authentication for Hive:
    <HiveConfiguration>
       <JDBCUrl>jdbc:hive2://myhost:10000/default</JDBCUrl>
       <KeyTabFile>/etc/security/keytabs/hive.keytab</KeyTabFile>
       <PrincipalName>hive/_Host@realm.com</PrincipalName>
    </HiveConfiguration>
  4. Save the configuration file.
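The lookups and parameter rules in the steps above can be checked with a short script. The following is a minimal sketch, not part of the product: the function names read_site_property and check_kerberos_settings are hypothetical, and the sketch assumes that hbase-site.xml and hive-site.xml use the standard Hadoop property layout and that the HBASEConfiguration and HiveConfiguration sections can appear anywhere in the configuration file.

```python
import os
import xml.etree.ElementTree as ET


def read_site_property(site_file, name):
    """Return the value of a named property from a Hadoop-style *-site.xml file.

    Returns None if the property is not defined in the file.
    """
    root = ET.parse(site_file).getroot()
    for prop in root.iter("property"):
        if (prop.findtext("name") or "").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


def check_kerberos_settings(config_file):
    """Return a list of problems found in the Kerberos parameters.

    Hypothetical helper: it only applies the rules stated in this topic
    (PrincipalName is required, KeyTabFile must be an absolute path,
    JDBCUrl must follow the jdbc:hive2:// format).
    """
    root = ET.parse(config_file).getroot()
    problems = []
    for section_name in ("HBASEConfiguration", "HiveConfiguration"):
        # Assumption: the section may sit at any depth in the file.
        section = next(root.iter(section_name), None)
        if section is None:
            problems.append(f"{section_name}: section not found")
            continue

        principal = (section.findtext("PrincipalName") or "").strip()
        if not principal:
            problems.append(f"{section_name}: PrincipalName is missing")

        keytab = (section.findtext("KeyTabFile") or "").strip()
        if keytab and not os.path.isabs(keytab):
            problems.append(f"{section_name}: KeyTabFile is not an absolute path")

        if section_name == "HiveConfiguration":
            jdbc_url = section.findtext("JDBCUrl")
            if jdbc_url is not None and not jdbc_url.strip().startswith("jdbc:hive2://"):
                problems.append(f"{section_name}: JDBCUrl does not start with jdbc:hive2://")
    return problems
```

For example, read_site_property("/path/to/hbase-site.xml", "hbase.master.kerberos.principal") returns the SPN to enter in the PrincipalName parameter, and an empty list from check_kerberos_settings means that none of the checks failed. The script does not verify that the keytab file exists on every node of the cluster.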
Before you run a batch job, run the kinit command with the same keytab file and service principal name that you specified in the configuration file.
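To keep the kinit call consistent with the configuration file, you can read both values from the file instead of typing them again. The following is a minimal sketch under stated assumptions: the function names build_kinit_command and run_kinit are hypothetical, the -k and -t options are those of the MIT Kerberos kinit client, and the values are passed to kinit exactly as they appear in the file.

```python
import subprocess
import xml.etree.ElementTree as ET


def build_kinit_command(config_file, section_name):
    """Build the kinit argument list from one section of the configuration file.

    section_name is "HBASEConfiguration" or "HiveConfiguration".
    """
    root = ET.parse(config_file).getroot()
    # Assumption: the section may sit at any depth in the file.
    section = next(root.iter(section_name), None)
    if section is None:
        raise ValueError(f"{section_name} section not found in {config_file}")

    keytab = (section.findtext("KeyTabFile") or "").strip()
    principal = (section.findtext("PrincipalName") or "").strip()
    if not keytab or not principal:
        raise ValueError(f"{section_name} must define KeyTabFile and PrincipalName")

    # MIT Kerberos: -k requests a ticket from a keytab, -t names the keytab file.
    return ["kinit", "-k", "-t", keytab, principal]


def run_kinit(config_file, section_name):
    """Run kinit and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_kinit_command(config_file, section_name), check=True)
```

Call run_kinit once for each section that you configured, for example run_kinit("/path/to/config.xml", "HBASEConfiguration"), before you start the batch job. The configuration file path shown here is a placeholder.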


Updated June 27, 2019