To load data from your data source to your Databricks lakehouse, you create a mapping. A mapping defines reusable data flow logic that you can use to load data to Databricks. When you download and install Data Integration through Partner Connect, a connection to your Databricks destination is automatically configured for you.
A mapping needs a runtime environment. A runtime environment is the execution platform that runs Informatica Intelligent Cloud Services assets such as tasks and taskflows. You must have at least one runtime environment in your organization so that you can run tasks.
If you're reading data from a cloud data warehouse, you can use the Informatica Cloud Hosted Agent as your runtime environment. The Hosted Agent is managed by Informatica and is already installed for you.
If you're reading data from an on-premises or other type of data source, you can download and install a Secure Agent from the
In the Mapping Designer, select the Target transformation.
The Target transformation represents your data destination.
On the Target tab, choose the connection to your Databricks destination.
Note that the connection is created for you based on the catalog you chose when you installed Data Integration from Partner Connect.
Click Select next to Object, choose the table that you want to load data to, and click OK.
Note that the schema you chose when you installed Data Integration appears in the Packages list.
In the Mapping Designer, select the Source transformation.
The Source transformation represents your data source. The source is the table or object that you're reading the data from.
On the Source tab, select New Connection.
Enter a connection name and choose the type, for example, Google BigQuery V2.
Select a runtime environment and enter the connection details.
If you're reading data from a cloud data warehouse, you can select the Informatica Cloud Hosted Agent. Otherwise, you'll need to create a runtime environment before you can create the connection. For more information, see Configuring a runtime environment in the Getting Started guide.
Click Test to test the connection to your data source.
When the connection is successful, click OK.
Data Integration creates a connection to the data source.
Click Select next to Object, choose the table or object that you want to read data from, and click OK.
Enter the source details, if required.
The source details vary based on the connection type.
Optionally, add transformations to the mapping.
Click Save to save and validate the mapping.
Click Run to run the mapping.
To monitor the mapping run, select My Jobs in the navigation menu on the left.
When the mapping runs successfully, you can create a mapping task to run the mapping on demand or on a schedule.
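The steps above can be summarized as a small conceptual model. The following Python sketch is illustrative only and does not call any Informatica or Databricks API. The names Mapping, choose_runtime_environment, and data_flow, and the table names in the usage example, are hypothetical. The sketch shows two ideas from this page: data flows from the Source transformation through any optional transformations to the Target transformation, and the runtime environment depends on whether the source is a cloud data warehouse.

```python
from __future__ import annotations

from dataclasses import dataclass, field


# Hypothetical model of a mapping, for illustration only.
# This is NOT an Informatica API; all names here are assumptions.
@dataclass
class Mapping:
    source_object: str               # table or object that data is read from
    target_object: str               # Databricks table that data is loaded to
    source_is_cloud_warehouse: bool  # True if the source is a cloud data warehouse
    transformations: list[str] = field(default_factory=list)  # optional steps


def choose_runtime_environment(mapping: Mapping) -> str:
    """Return the kind of runtime environment described on this page."""
    if mapping.source_is_cloud_warehouse:
        # Managed by Informatica and already installed.
        return "Informatica Cloud Hosted Agent"
    # On-premises or other sources need an agent that you install yourself.
    return "Secure Agent"


def data_flow(mapping: Mapping) -> list[str]:
    """List the transformations in the order that data moves through them."""
    return [
        f"Source({mapping.source_object})",
        *mapping.transformations,
        f"Target({mapping.target_object})",
    ]


if __name__ == "__main__":
    # Hypothetical table names.
    mapping = Mapping(
        source_object="orders",
        target_object="orders_lakehouse",
        source_is_cloud_warehouse=True,
    )
    print(choose_runtime_environment(mapping))
    print(" -> ".join(data_flow(mapping)))
```

In Data Integration itself you make these choices in the Mapping Designer and in the connection properties. The sketch only makes the order of the steps and the runtime environment decision explicit.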