PowerExchange for Hadoop User Guide for PowerCenter

Configure PowerCenter for Hadoop Cluster
You can configure PowerCenter and the PowerCenter Integration Service to read data from and write data to a Hadoop cluster. The Hadoop cluster can be a High Availability (HA), non-HA, Kerberos-enabled, or non-Kerberos cluster.
Perform the following steps to configure PowerCenter for Cloudera, Hortonworks, IBM BigInsights, and MapR distributions:
  1. On the Informatica node where the PowerCenter Integration Service runs, create a directory. The PowerCenter administrator user must have read access to this directory. For example:
    <INFA_HOME>/pwx-hadoop/conf
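    A minimal shell sketch, assuming <INFA_HOME> is /opt/informatica (a hypothetical install path):
    # Create the configuration directory and make it readable
    mkdir -p /opt/informatica/pwx-hadoop/conf
    chmod 755 /opt/informatica/pwx-hadoop/conf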
  2. Copy the following files from the Hadoop cluster to the directory created in step 1:
    • /etc/hadoop/conf/core-site.xml
    • /etc/hadoop/conf/mapred-site.xml
    • /etc/hadoop/conf/hdfs-site.xml
    • /etc/hive/conf/hive-site.xml
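    For example, assuming a hypothetical cluster node named hadoop-node1:
    # Copy the cluster configuration files in one command
    scp hadoop-node1:/etc/hadoop/conf/core-site.xml \
        hadoop-node1:/etc/hadoop/conf/mapred-site.xml \
        hadoop-node1:/etc/hadoop/conf/hdfs-site.xml \
        hadoop-node1:/etc/hive/conf/hive-site.xml \
        <INFA_HOME>/pwx-hadoop/conf/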
  3. Optional. Applicable to Kerberos-enabled clusters. Ensure that the PowerCenter administrator user exists on all Hadoop cluster nodes with the same UID, and run kinit on each node to create the Kerberos ticket cache file.
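    For example, assuming a hypothetical pcadmin user in the EXAMPLE.COM realm:
    id -u pcadmin                # verify the UID matches across nodes
    kinit pcadmin@EXAMPLE.COM    # create the ticket cache on this node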
  4. Optional. Applicable to Kerberos-enabled clusters. Run kinit on the Informatica node where the PowerCenter Integration Service runs to create the Kerberos ticket cache file. For example:
    /tmp/krb5cc_<UID>
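    A minimal command sketch, again assuming the hypothetical pcadmin@EXAMPLE.COM principal:
    kinit pcadmin@EXAMPLE.COM    # writes the cache to /tmp/krb5cc_<UID>
    klist                        # confirm the cache file path and ticket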
  5. Optional. Applicable to Kerberos-enabled clusters except MapR. Edit the core-site.xml file in the directory created in step 1 and add the following property:
    <property>
      <name>hadoop.security.kerberos.ticket.cache.path</name>
      <value>/tmp/REPLACE_WITH_CACHE_FILENAME</value>
      <description>Path to the Kerberos ticket cache.</description>
    </property>
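    For example, for a hypothetical UID of 1001, the value would be /tmp/krb5cc_1001.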
  6. In the Administrator tool, go to the Services and Nodes tab. Select the Processes view for the required PowerCenter Integration Service and add the CLASSPATH environment variable with the value of the directory created in step 1.
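    For example, using the directory created in step 1:
    CLASSPATH=<INFA_HOME>/pwx-hadoop/conf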
  7. Recycle the service. Click Actions > Recycle Service.
  8. In the Workflow Manager, create the HDFS connection, assign it to the source or target, and run the workflow. When you create the HDFS connection, use the value of the fs.default.name property for the NameNode URI. You can find the fs.default.name property in the core-site.xml file.
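    For example, a core-site.xml entry with a hypothetical NameNode host and port:
    <property>
      <name>fs.default.name</name>
      <value>hdfs://namenode.example.com:8020</value>
    </property>
    In this case, you would enter hdfs://namenode.example.com:8020 as the NameNode URI.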
