Installation for Data Engineering

Prepare for Cluster Import

When you run the installer, you can choose to configure the cluster. The cluster configuration enables the Data Integration Service to push mapping logic to the cluster. To integrate the Informatica domain with a non-native cluster, you must import a cluster configuration. You can import the cluster information directly from the cluster or from an archive file.
You can import cluster information into the domain from an archive file of any supported cluster. Your administrator might prefer to provide you with the archive file to protect sensitive cluster connection information. The archive file can be in .zip or .tar format. Ensure that you store the archive file locally.
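Before you start the import, it can help to confirm that the file you received is a readable .zip or .tar archive on the local file system. The following Python sketch uses only the standard library. The function name archive_format is an illustrative assumption and is not part of any Informatica tool.

```python
import tarfile
import zipfile
from pathlib import Path


def archive_format(path):
    """Return "zip" or "tar" for a readable local archive, otherwise None.

    Hypothetical helper for a pre-import sanity check; it inspects the
    file contents rather than trusting the file extension.
    """
    p = Path(path)
    if not p.is_file():
        return None  # the archive must exist locally
    if zipfile.is_zipfile(str(p)):
        return "zip"
    if tarfile.is_tarfile(str(p)):
        return "tar"
    return None
```

A result of None suggests that the file is missing, corrupted, or in a format other than .zip or .tar.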

Prepare the Archive File for the Hadoop Environment

To import the cluster configuration from an Amazon EMR, MapR, or Google Dataproc cluster, you must import from an archive file. Depending on the distribution, the Hadoop cluster configuration archive file can contain the following files:
  • core-site.xml
  • hbase-site.xml. hbase-site.xml is required only if you access HBase sources and targets.
  • hdfs-site.xml
  • hive-site.xml
  • mapred-site.xml or tez-site.xml. Include the mapred-site.xml file or the tez-site.xml file based on the Hive execution type used on the Hadoop cluster.
  • yarn-site.xml
When you configure a CDP Public Cloud cluster, the hbase-site.xml file is on the Data Lake cluster. The other files are on the Data Hub cluster.
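As a rough illustration of assembling the archive, the following Python sketch collects the site files from a local directory and writes them to a .zip file. It rests on several assumptions that you should verify for your distribution: the four files in REQUIRED are treated as mandatory, the files are placed at the root of the archive, and the function name build_cluster_archive and the directory layout are hypothetical.

```python
import zipfile
from pathlib import Path

# File names come from the list above. Treating these four as mandatory
# is an assumption; actual contents depend on the distribution.
REQUIRED = ["core-site.xml", "hdfs-site.xml", "hive-site.xml", "yarn-site.xml"]
# Include whichever file matches the Hive execution type on the cluster.
EXECUTION = ["mapred-site.xml", "tez-site.xml"]
# Needed only if you access HBase sources and targets.
OPTIONAL = ["hbase-site.xml"]


def build_cluster_archive(conf_dir, archive_path):
    """Zip the site files found in conf_dir and return the names added."""
    conf_dir = Path(conf_dir)

    missing = [n for n in REQUIRED if not (conf_dir / n).is_file()]
    if missing:
        raise FileNotFoundError("missing site files: " + ", ".join(missing))

    execution = [n for n in EXECUTION if (conf_dir / n).is_file()]
    if not execution:
        raise FileNotFoundError("need mapred-site.xml or tez-site.xml")

    optional = [n for n in OPTIONAL if (conf_dir / n).is_file()]
    names = REQUIRED + execution + optional

    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for name in names:
            # arcname stores each file at the archive root (assumed layout).
            zf.write(conf_dir / name, arcname=name)
    return names
```

For a CDP Public Cloud cluster, you would first copy hbase-site.xml from the Data Lake cluster and the other files from the Data Hub cluster into the same local directory.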

Prepare the Archive File for the Databricks Environment

To create the .xml file for import, you must get the required information from the Databricks administrator. You can provide any name for the file and store it locally.
The following table describes the cluster properties that you must configure in the import file for the Databricks environment:

Property Name    Description
-------------    -----------
cluster_name     Name of the Databricks cluster.
cluster_ID       The cluster ID of the Databricks cluster.
base URL         URL to access the Databricks cluster.
accesstoken      Token ID created within Databricks, required for authentication.
Optionally, you can include other properties specific to the Databricks environment. When you complete the .xml file, compress it into a .zip or .tar file for import.
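The compression step could be scripted along the following lines. This Python sketch assumes that the .xml file already exists and makes no claim about its internal structure. The function name compress_import_file is hypothetical, and the archive format is chosen from the extension of the output path.

```python
import tarfile
import zipfile
from pathlib import Path


def compress_import_file(xml_path, archive_path):
    """Compress a finished .xml import file into a .zip or .tar archive."""
    xml_path = Path(xml_path)
    archive_path = Path(archive_path)

    if xml_path.suffix.lower() != ".xml":
        raise ValueError("expected an .xml import file")

    suffix = archive_path.suffix.lower()
    if suffix == ".zip":
        with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
            # Store only the file name, not the full local path.
            zf.write(xml_path, arcname=xml_path.name)
    elif suffix == ".tar":
        with tarfile.open(archive_path, "w") as tf:
            tf.add(xml_path, arcname=xml_path.name)
    else:
        raise ValueError("archive must end in .zip or .tar")
    return archive_path
```

Because the import file contains the access token, consider restricting who can read both the .xml file and the resulting archive.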
