Configure Informatica Cloud Data Integration-Free through Databricks Partner Connect

Creating a mapping to load data to Databricks

To load data from your data source to your Databricks lakehouse, you create a mapping. A mapping defines reusable data flow logic that you can use to load data to Databricks. When you download and install Data Integration through Partner Connect, a connection to your Databricks destination is automatically configured for you.
A mapping needs a runtime environment. A runtime environment is the execution platform that runs Informatica Intelligent Cloud Services assets such as tasks and taskflows. You must have at least one runtime environment in your organization so that you can run tasks.
If you're reading data from a cloud data warehouse, you can use the Informatica Cloud Hosted Agent as your runtime environment. The Hosted Agent is managed by Informatica and is already installed for you.
If you're reading data from an on-premises or other type of data source, you can download and install a Secure Agent from the Home page or from the Runtime Environments page in Administrator. For more information, see Configuring a runtime environment in the Getting Started guide.
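The runtime environment choice described above can be summarized as a small decision rule. The following is only an illustrative sketch: the function name and the source-kind labels are hypothetical and are not part of any Informatica API.

```python
# Illustrative sketch only: these names are hypothetical, not an Informatica API.
HOSTED_AGENT = "Informatica Cloud Hosted Agent"
SECURE_AGENT = "Secure Agent"


def choose_runtime_environment(source_kind: str) -> str:
    """Return the runtime environment suggested for a kind of data source.

    source_kind is a hypothetical label such as "cloud_data_warehouse"
    or "on_premises".
    """
    if source_kind == "cloud_data_warehouse":
        # The Hosted Agent is managed by Informatica and already installed.
        return HOSTED_AGENT
    # On-premises and other source types need a Secure Agent that you
    # download and install yourself.
    return SECURE_AGENT


cloud_choice = choose_runtime_environment("cloud_data_warehouse")
on_prem_choice = choose_runtime_environment("on_premises")
```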
  1. From the Home page in Data Integration, choose Transform data using a mapping.
    You can also choose New > Mapping from the navigation menu on the left.
  2. In the Mapping Designer, select the Target transformation.
    The Target transformation represents your data destination.
  3. On the Target tab, choose the connection to your Databricks destination.
    Note that the connection is created for you based on the catalog you chose when you installed Data Integration from Partner Connect.
  4. Click Select next to Object, choose the table that you want to load data to, and click OK.
    Note that the schema you chose when you installed Data Integration appears in the Packages list.
  5. In the Mapping Designer, select the Source transformation.
    The Source transformation represents your data source. The source is the table or object that you're reading the data from.
  6. On the Source tab, select New Connection.
  7. Enter a connection name and choose the type, for example, Google BigQuery V2.
  8. Select a runtime environment and enter the connection details.
    If you're reading data from a cloud data warehouse, you can select the Informatica Cloud Hosted Agent. Otherwise, you'll need to create a runtime environment before you can create the connection. For more information, see Configuring a runtime environment in the Getting Started guide.
  9. Click Test to test the connection to your data source.
  10. When the connection is successful, click OK.
    Data Integration creates a connection to the data source.
  11. Click Select next to Object, choose the table or object that you want to read data from, and click OK.
  12. Enter the source details, if required.
    The source details vary based on the connection type.
  13. Optionally, add transformations to the mapping.
  14. Click Save to save and validate the mapping.
  15. Click Run to run the mapping.
    To monitor the mapping run, select My Jobs in the navigation menu on the left.
When the mapping runs successfully, you can create a mapping task to run the mapping on demand or on a schedule.
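
As a conceptual aid, the mapping built in the steps above can be pictured as a Source, optional transformations, and a Target chained into one reusable data flow. The sketch below models that idea in plain Python. It is not how Data Integration is implemented, and the `Mapping` class, its fields, and the sample rows are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Iterable, List

Row = Dict[str, Any]


@dataclass
class Mapping:
    """Hypothetical model of a mapping: source -> transformations -> target."""

    read_source: Callable[[], Iterable[Row]]    # stands in for the Source transformation
    write_target: Callable[[List[Row]], None]   # stands in for the Target transformation
    transformations: List[Callable[[Row], Row]] = field(default_factory=list)

    def run(self) -> int:
        """Read every source row, apply each transformation in order,
        write the results to the target, and return the row count."""
        rows: List[Row] = []
        for row in self.read_source():
            for transform in self.transformations:
                row = transform(row)
            rows.append(row)
        self.write_target(rows)
        return len(rows)


# An in-memory list stands in for the Databricks target table.
target_table: List[Row] = []

mapping = Mapping(
    read_source=lambda: [{"id": 1, "name": "ada"}, {"id": 2, "name": "grace"}],
    write_target=target_table.extend,
    transformations=[lambda r: {**r, "name": r["name"].upper()}],
)

loaded = mapping.run()
```

Because the flow logic lives in the `mapping` object rather than in a single run, calling `mapping.run()` again repeats the same load, which mirrors how a mapping task can run the same mapping on demand or on a schedule.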
