Prepare the Archive File for Import from Azure HDInsight

When you prepare the archive file for a cluster configuration import from HDInsight, include all required *-site.xml files, and then manually edit the files in the archive after you create it.
Create a .zip or .tar file that contains the following *-site.xml files:
  • core-site.xml
  • hbase-site.xml. Required only to access HBase sources and targets.
  • hdfs-site.xml
  • hive-site.xml
  • mapred-site.xml or tez-site.xml. Include the mapred-site.xml file or the tez-site.xml file based on the Hive execution type used on the Hadoop cluster.
  • yarn-site.xml
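
The file list above can be assembled with a short script. The following is a minimal sketch in Python, not a documented procedure: the function name build_archive, its parameters, and the choice to store the files at the root of a .zip archive are all assumptions made for illustration. It assumes the *-site.xml files have already been copied from the cluster into a local directory.

```python
import zipfile
from pathlib import Path

# Files the section lists as always required.
REQUIRED_FILES = ["core-site.xml", "hdfs-site.xml", "hive-site.xml", "yarn-site.xml"]
# Exactly one of these is included, based on the Hive execution type.
EXECUTION_FILES = ["mapred-site.xml", "tez-site.xml"]
# Needed only to access HBase sources and targets.
HBASE_FILE = "hbase-site.xml"


def build_archive(conf_dir, archive_path, execution_file="tez-site.xml", include_hbase=False):
    """Create a .zip archive of the *-site.xml files found in conf_dir.

    Hypothetical helper: the name, parameters, and flat archive layout
    are illustrative assumptions, not part of any product API.
    Returns the list of file names written to the archive.
    """
    if execution_file not in EXECUTION_FILES:
        raise ValueError(f"execution_file must be one of {EXECUTION_FILES}")

    names = REQUIRED_FILES + [execution_file]
    if include_hbase:
        names.append(HBASE_FILE)

    conf_dir = Path(conf_dir)
    missing = [name for name in names if not (conf_dir / name).is_file()]
    if missing:
        raise FileNotFoundError(f"missing configuration files: {missing}")

    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for name in names:
            # Assumption: each file is stored at the archive root.
            archive.write(conf_dir / name, arcname=name)
    return names
```

Failing early on a missing file keeps an incomplete archive from reaching the import step. Pass execution_file="mapred-site.xml" when the cluster runs Hive on MapReduce, and include_hbase=True only when HBase sources or targets are used.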

Update the Archive File

After you create the archive file, edit the Hortonworks Data Platform (HDP) version string wherever it appears in the files in the archive. Search for the HDP version string and replace all instances with the HDP version that HDInsight includes in the Hadoop distribution.
For example, the edited tez.task.launch.cluster-default.cmd-opts property value looks similar to the following:
<property>
  <name>tez.task.launch.cluster-default.cmd-opts</name>
  <value>-server -Dhdp.version=</value>
</property>
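
The search-and-replace step can also be scripted instead of being done by hand. The following is a rough sketch in Python under several assumptions: the archive is a .zip file (a .tar archive would need the tarfile module instead), the *-site.xml files are UTF-8 encoded, and the function name replace_in_archive is hypothetical. The string to search for and the HDP version to substitute are passed in by the caller; neither value is assumed here.

```python
import zipfile


def replace_in_archive(src_zip, dst_zip, placeholder, version):
    """Copy src_zip to dst_zip, replacing placeholder with version
    in every *-site.xml member.

    Hypothetical helper for illustration. src_zip and dst_zip must be
    different paths. Returns a dict that maps each *-site.xml member
    name to the number of replacements made in it.
    """
    if not placeholder:
        raise ValueError("placeholder must be a non-empty string")

    counts = {}
    with zipfile.ZipFile(src_zip) as source, \
            zipfile.ZipFile(dst_zip, "w", zipfile.ZIP_DEFLATED) as target:
        for info in source.infolist():
            data = source.read(info.filename)
            if info.filename.endswith("-site.xml"):
                # Assumption: the configuration files are UTF-8 text.
                text = data.decode("utf-8")
                counts[info.filename] = text.count(placeholder)
                data = text.replace(placeholder, version).encode("utf-8")
            # Members that are not *-site.xml files are copied unchanged.
            target.writestr(info.filename, data)
    return counts
```

Writing to a new archive rather than editing in place leaves the original untouched, and the returned counts make it easy to confirm which files contained the string before you import the edited archive.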