Installation for Data Engineering


Prepare for Cluster Import

When you run the installer, you can choose to configure the cluster. The cluster configuration enables the Data Integration Service to push mapping logic to the cluster. To integrate the Informatica domain with the non-native cluster, you must import a cluster configuration. You can import the cluster information directly from the cluster or from an archive file.

You can import cluster information from an archive file of any supported cluster into the domain. Your administrator might prefer to provide you with the archive file to protect sensitive connection information to the cluster. The archive file can be in a .zip or .tar format. Ensure that you store the archive file locally.
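Before you start the import, you might want to confirm that the file you received is a readable local .zip or .tar archive. The following is a minimal sketch that uses only the Python standard library. The function name `archive_format` is hypothetical and is not part of any Informatica tool.

```python
import tarfile
import zipfile
from pathlib import Path


def archive_format(path):
    """Return "zip" or "tar" for a readable local archive, otherwise None.

    Hypothetical helper for illustration only.
    """
    p = Path(path)
    if not p.is_file():
        # The archive must exist on the local file system.
        return None
    if zipfile.is_zipfile(str(p)):
        return "zip"
    if tarfile.is_tarfile(str(p)):
        return "tar"
    return None
```

A result of None means that the file is missing or is not in one of the two formats named above.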

Prepare the Archive File for Hadoop Environment

To import the cluster configuration from an Amazon EMR, MapR, or Google Dataproc cluster, you must import from an archive file. Depending on the distribution, the Hadoop cluster configuration archive file can contain the following files:

core-site.xml

hbase-site.xml. Required only if you access HBase sources and targets.

hdfs-site.xml

hive-site.xml

mapred-site.xml or tez-site.xml. Include the mapred-site.xml file or the tez-site.xml file based on the Hive execution type used on the Hadoop cluster.

yarn-site.xml

When you configure a CDP Public Cloud cluster, the hbase-site.xml file is on the Data Lake cluster. The other files are on the Data Hub cluster.
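As an illustration, the files listed above could be collected into a .zip archive with a short script. This is a sketch under several assumptions: the site files were already copied into one local directory, your distribution uses all four of the always-listed files, and the files sit at the root of the archive. The function name `build_hadoop_archive` and its parameters are hypothetical.

```python
import zipfile
from pathlib import Path

# File names taken from the list above. Whether your distribution
# needs every one of them is an assumption of this sketch.
ALWAYS_INCLUDED = ["core-site.xml", "hdfs-site.xml", "hive-site.xml", "yarn-site.xml"]


def build_hadoop_archive(conf_dir, archive_path, hive_engine="tez", uses_hbase=False):
    """Write the cluster site files from conf_dir into a .zip archive.

    hive_engine: "tez" selects tez-site.xml, "mr" selects mapred-site.xml.
    uses_hbase:  add hbase-site.xml only when HBase sources or targets are used.
    Returns the list of file names written to the archive.
    """
    if hive_engine not in ("tez", "mr"):
        raise ValueError("hive_engine must be 'tez' or 'mr'")

    conf_dir = Path(conf_dir)
    names = list(ALWAYS_INCLUDED)
    names.append("tez-site.xml" if hive_engine == "tez" else "mapred-site.xml")
    if uses_hbase:
        names.append("hbase-site.xml")

    # Fail early instead of producing an incomplete archive.
    missing = [n for n in names if not (conf_dir / n).is_file()]
    if missing:
        raise FileNotFoundError("Missing site files: " + ", ".join(missing))

    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for n in names:
            # Store each file at the archive root (an assumption).
            zf.write(conf_dir / n, arcname=n)
    return names
```

For a CDP Public Cloud cluster, you would first copy hbase-site.xml from the Data Lake cluster and the other files from the Data Hub cluster into the same local directory.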

Prepare the Archive File for the Databricks Environment

To create the .xml file for import, you must get the required information from the Databricks administrator. You can provide any name for the file and store it locally.

The following table describes the cluster properties that you must configure in the import file for the Databricks environment:

Property Name	Description
cluster_name	Name of the Databricks cluster.
cluster_ID	The cluster ID of the Databricks cluster.
base URL	URL to access the Databricks cluster.
accesstoken	Token ID created within Databricks. Required for authentication.

Optionally, you can include other properties specific to the Databricks environment. When you complete the .xml file, compress it into a .zip or .tar file for import.
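The final compression step could be scripted as follows. This sketch assumes that the .xml file is already complete. It does not generate or validate the XML content, because the layout that the import expects is not described here. The function name `compress_import_file` is hypothetical.

```python
import tarfile
import zipfile
from pathlib import Path


def compress_import_file(xml_path, archive_path):
    """Package a completed .xml import file as a .zip or .tar archive.

    The archive format is chosen from the extension of archive_path.
    Hypothetical helper for illustration only.
    """
    xml_path = Path(xml_path)
    archive_path = Path(archive_path)
    if not xml_path.is_file():
        raise FileNotFoundError(str(xml_path))

    if archive_path.suffix == ".zip":
        with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
            # Store only the file name, not the full local path.
            zf.write(xml_path, arcname=xml_path.name)
    elif archive_path.suffix == ".tar":
        with tarfile.open(archive_path, "w") as tf:
            tf.add(xml_path, arcname=xml_path.name)
    else:
        raise ValueError("archive_path must end in .zip or .tar")
    return archive_path
```

Because the file contains the access token, keep both the .xml file and the archive in a location that only authorized users can read.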
