Hadoop Integration Guide

Verify Product Installations

Before you begin the Big Data Management integration between the domain and Hadoop environments, verify that Informatica and third-party products are installed.
You must install the following products:
Informatica domain and clients
Install and configure the Informatica domain and the Developer tool. The Informatica domain must have a Model Repository Service, a Data Integration Service, and a Metadata Access Service.
Hadoop File System and MapReduce
Verify that Hadoop is installed with Hadoop Distributed File System (HDFS) and MapReduce on each node. You can install Hadoop in a single-node environment or in a cluster. The Hadoop installation must include a Hive data warehouse with a non-embedded database for the Hive metastore. For more information, see the Apache website: http://hadoop.apache.org.
Database client software
Install the database client software to perform database read and write operations in native mode. Informatica requires the client software to run MapReduce or Tez jobs on the Hive engine. For example, install the Oracle client to connect to an Oracle database.
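
As a rough aid for the checklist above, the following Python sketch reports which command-line tools are not found on the PATH of the machine where it runs. It is not an Informatica utility. The command names (hadoop, hdfs, hive, and sqlplus for the Oracle client example) are assumptions about a typical installation and may differ in your environment. Finding a command on the PATH does not prove that the product is configured correctly, and the sketch does not check the Informatica domain or its services.

```python
import shutil

# Assumed command names for each product; adjust for your environment.
# These are illustrative and are not taken from the Informatica documentation.
REQUIRED_COMMANDS = {
    "Hadoop (HDFS and MapReduce)": ["hadoop", "hdfs"],
    "Hive": ["hive"],
    "Oracle client (example database client)": ["sqlplus"],
}


def find_missing(required=REQUIRED_COMMANDS, which=shutil.which):
    """Return {product: [commands not found on PATH]} for products with gaps."""
    missing = {}
    for product, commands in required.items():
        # shutil.which returns None when the command is not on PATH.
        absent = [cmd for cmd in commands if which(cmd) is None]
        if absent:
            missing[product] = absent
    return missing


if __name__ == "__main__":
    gaps = find_missing()
    if not gaps:
        print("All checked commands were found on PATH.")
    for product, commands in gaps.items():
        print(f"{product}: not found on PATH: {', '.join(commands)}")
```

Run the script on each node that you want to check. An empty result means only that the commands were located, so follow up with the verification steps for your Hadoop distribution.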
