Native Environment

The integration with Databricks requires tools, services, and a repository database in the Informatica domain.

Clients and Tools

When the Informatica domain is integrated with Databricks, you can use the following tools:
Informatica Administrator
Use the Administrator tool to manage the Informatica domain and application services. You can also create objects such as connections, cluster configurations, and cloud provisioning configurations to enable big data operations.
The Developer tool
Use the Developer tool to import sources and targets and create mappings to run in the Databricks environment.

Application Services

The domain integration with Databricks uses the following services:
Data Integration Service
The Data Integration Service can process mappings in the native environment, or it can push the processing to the Databricks environment. The Data Integration Service retrieves metadata from the Model repository when you run a mapping.
Model Repository Service
The Model Repository Service manages the Model repository. All requests to save or access Model repository metadata go through the Model Repository Service.

Model Repository

The Model repository stores mappings that you create and manage in the Developer tool.
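
The relationship between the services and the Model repository can be pictured with a small conceptual model. This is only an illustrative sketch in Python, not Informatica code: the class names, method names, and the sample mapping are hypothetical and do not correspond to any Informatica API. It shows the Data Integration Service retrieving mapping metadata through the Model Repository Service and then either processing the mapping natively or pushing it to the Databricks environment.

```python
from dataclasses import dataclass, field
from enum import Enum


class RunEnvironment(Enum):
    """Where a mapping is processed (hypothetical names for illustration)."""
    NATIVE = "native"
    DATABRICKS = "databricks"


@dataclass
class ModelRepositoryService:
    """Hypothetical stand-in: the single gateway to Model repository metadata."""
    _repository: dict = field(default_factory=dict)

    def save_mapping(self, name, definition):
        # Every save request goes through the service, never directly to storage.
        self._repository[name] = definition

    def get_mapping(self, name):
        # Every read request goes through the service as well.
        return self._repository[name]


@dataclass
class DataIntegrationService:
    """Hypothetical stand-in: runs mappings natively or pushes them to Databricks."""
    repository_service: ModelRepositoryService

    def run_mapping(self, name, environment):
        # Metadata is retrieved at run time, via the Model Repository Service.
        definition = self.repository_service.get_mapping(name)
        if environment is RunEnvironment.NATIVE:
            return f"processed '{name}' ({definition}) in the native environment"
        return f"pushed '{name}' ({definition}) to the Databricks environment"


if __name__ == "__main__":
    mrs = ModelRepositoryService()
    mrs.save_mapping("m_orders", "source -> target")  # as if saved from the Developer tool
    dis = DataIntegrationService(mrs)
    print(dis.run_mapping("m_orders", RunEnvironment.NATIVE))
    print(dis.run_mapping("m_orders", RunEnvironment.DATABRICKS))
```

In this sketch the Data Integration Service never touches the repository storage directly, which mirrors the rule that all metadata requests go through the Model Repository Service.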