To replicate change data to Cloudera and Hortonworks targets on a Hadoop Distributed File System (HDFS), you must complete several prerequisite tasks to prepare the systems where the Applier and Data Replication Console run.
Install the 64-bit Java Development Kit (JDK) 1.7 or 1.8 if you have not done so already.
For Cloudera or Hortonworks targets that use Kerberos authentication, ensure that JDK 1.7u65 or later is installed.
Define the JAVA_HOME environment variable to point to the root Java installation directory.
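On Linux or UNIX, for example, you can set the variable in a shell profile and confirm that it points to a valid JDK. The installation path below is a placeholder; substitute your actual Java directory:

```shell
# Placeholder JDK location -- replace with your actual installation path.
export JAVA_HOME=/usr/java/jdk1.8.0

# Sanity check: a valid JAVA_HOME contains the bin/java launcher.
if [ -x "$JAVA_HOME/bin/java" ]; then
    echo "JAVA_HOME is set correctly"
else
    echo "warning: $JAVA_HOME does not look like a JDK" >&2
fi
```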
Add the Java virtual machine (JVM) library to the system path.
On Windows, add the directory that contains the jvm.dll library to the PATH environment variable. For example, use the following command:
PATH=%PATH%;%JAVA_HOME%\jre\bin\server
On Linux and UNIX, add the directory that contains the libjvm.so library to the library path environment variable for your operating system, for example, LD_LIBRARY_PATH on Linux and Solaris, LIBPATH on AIX, or SHLIB_PATH on HP-UX.
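On Linux, for example, you can append the directory that holds libjvm.so to LD_LIBRARY_PATH. The jre/lib/amd64/server subpath shown is typical for a 64-bit JDK 1.7 or 1.8 layout; locate libjvm.so under your own JAVA_HOME if it differs:

```shell
# Append the JVM server library directory to the library path.
# The subpath is typical for a 64-bit JDK 1.8; verify it against
# your installation (find "$JAVA_HOME" -name libjvm.so).
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$JAVA_HOME/jre/lib/amd64/server
```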
On Windows, add the bin subdirectory that contains the WinUtils executable file and the required .dll libraries to the HADOOP_HOME environment variable.
On Windows, for Cloudera or Hortonworks targets that use Kerberos authentication, define the DBSYNC_KERBEROS_CACHE_NAME environment variable. This environment variable points to the file that contains the Kerberos credential cache.
You can get the path to the Kerberos credential cache from the KRB5CCNAME environment variable.
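For example, at a Windows command prompt, you could set the variable as follows. The cache file path is purely illustrative; use the location reported by your KRB5CCNAME variable:

```
:: Illustrative cache file path -- take the real path from KRB5CCNAME.
set DBSYNC_KERBEROS_CACHE_NAME=C:\Users\repl\krb5cc_repl
```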
Download the hadoop_libs.zip file that Data Replication provides, which contains the required .jar files. Extract this zip file into the DataReplication_installation directory.
Verify that the DataReplication_installation/lib directory contains the hadoop subdirectory.
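On Linux, the extract-and-verify step can be sketched as follows. The installation path is a placeholder for your actual DataReplication_installation directory:

```shell
# Placeholder installation directory -- substitute your own.
DR_HOME=/opt/DataReplication

# Extract the downloaded archive into the installation directory.
unzip -o hadoop_libs.zip -d "$DR_HOME"

# Confirm that the hadoop subdirectory now exists under lib.
if [ -d "$DR_HOME/lib/hadoop" ]; then
    echo "hadoop libraries extracted"
else
    echo "error: $DR_HOME/lib/hadoop is missing" >&2
fi
```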
For Cloudera and Hortonworks targets, copy the following configuration files to the DataReplication_installation/lib/hadoop/hadoop_distribution directory:
hdfs-site.xml
core-site.xml
yarn-site.xml
The yarn-site.xml file is required only if the target uses HDFS high availability.
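On a Linux machine, for example, the files can be copied from the cluster's Hadoop configuration directory. All paths below are placeholders: /etc/hadoop/conf is a common but not universal configuration location, and hadoop_distribution stands for your distribution-specific subdirectory:

```shell
# Placeholder paths -- adjust for your cluster and installation.
HADOOP_CONF=/etc/hadoop/conf
DR_HADOOP_LIB=/opt/DataReplication/lib/hadoop/hadoop_distribution

# hdfs-site.xml and core-site.xml are always required.
cp "$HADOOP_CONF/hdfs-site.xml" "$DR_HADOOP_LIB/"
cp "$HADOOP_CONF/core-site.xml" "$DR_HADOOP_LIB/"

# yarn-site.xml is needed only when the target uses HDFS high availability.
cp "$HADOOP_CONF/yarn-site.xml" "$DR_HADOOP_LIB/"
```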