Table of Contents

  1. Preface
  2. Introduction to Databricks Delta Connector
  3. Connections for Databricks Delta
  4. Mappings for Databricks Delta
  5. Migrating a mapping
  6. Databricks Delta SQL ELT optimization
  7. Data type reference

Databricks Delta Connector

Optimize the staging performance of a mapping

By default, Data Integration creates a flat file locally in a temporary folder to stage the data before writing it to Databricks Delta. You can configure Data Integration to optimize the staging performance of the write operation on a Linux machine.
If you do not set the staging property, Data Integration performs staging without the optimized settings, which might impact the performance of the task.
You can optimize the staging performance only when you use the SQL warehouse to run the mappings.

Setting the staging property

To optimize the staging performance of the write operation on Linux, set the following staging property in the Secure Agent properties:
INFA_DTM_STAGING_ENABLED_CONNECTORS
Perform the following steps to set the staging property for the Tomcat service in the Secure Agent properties:
  1. In Administrator, click Runtime Environments.
  2. Edit the Secure Agent for which you want to set the property.
  3. In the System Configuration Details section, select the Service as Data Integration Server and the type as Tomcat.
  4. Set the value of the Tomcat property to the plugin ID of Databricks Delta Connector.
    You can find the plugin ID in the manifest file located in the following directory:
    <Secure Agent installation directory>/downloads/<Databricks Delta package>/CCIManifest
When you run the mapping, Data Integration creates a flat file in the following directory on your Linux machine:
/tmp/DELTA_stage
Check the session log. If staging through the flat file succeeds, Data Integration logs the following message in the session log:
The INFA_DTM_STAGING is successfully enabled to use the flat file to create local staging files.
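The log check above can be automated with a small script. This is a minimal sketch: the success message is taken verbatim from the documentation, but the session log path and format are assumptions, so the demo runs against an inline sample string.

```python
# Sketch: confirm in the session log that flat-file staging was used.
# The message text is from the documentation; the log location varies per install.
SUCCESS_MESSAGE = (
    "The INFA_DTM_STAGING is successfully enabled to use the "
    "flat file to create local staging files."
)

def staging_enabled(log_text: str) -> bool:
    """Return True if the session log reports optimized flat-file staging."""
    return SUCCESS_MESSAGE in log_text

# Demo with a fabricated log excerpt rather than a real session log file.
sample_log = "...\n" + SUCCESS_MESSAGE + "\n..."
print(staging_enabled(sample_log))  # -> True
```

In practice you would read the session log file for the task run and pass its contents to `staging_enabled`.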

Rules and guidelines

Consider the following guidelines when you enable the staging property for the write operation:
  • If a mapping enabled for SQL ELT optimization falls back and runs without SQL ELT optimization because of any issue, Data Integration performs staging without the optimized settings, even if you set the staging property to optimize the staging performance.
  • When you read binary data and the data size exceeds 50 MB in a single cell, you need to adjust the default DTM staging file size of 50 MB set in the advanced target properties to a higher value proportional to your data size. However, ensure that the data size within a single cell does not exceed 70 MB to prevent data truncation in the target.
  • When you write values of double data type to a Databricks Delta target, the values are rounded off. To write exact values from the source to the target, use the decimal data type.
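The sizing rule for binary data above can be expressed as a small check. The 50 MB default and 70 MB cap are from the guideline; the helper function itself is illustrative, not part of the product.

```python
# Sketch of the binary-data sizing guideline: raise the DTM staging file size
# (default 50 MB) to cover the largest single cell, but reject cells over
# 70 MB, which would be truncated in the target.
MB = 1024 * 1024
DEFAULT_STAGING_FILE_MB = 50  # default set in the advanced target properties
MAX_CELL_MB = 70              # above this, data in the cell is truncated

def recommended_staging_file_mb(largest_cell_bytes: int) -> int:
    """Return a staging file size (MB) proportional to the largest cell."""
    largest_cell_mb = -(-largest_cell_bytes // MB)  # ceiling division to whole MB
    if largest_cell_mb > MAX_CELL_MB:
        raise ValueError("Cell exceeds 70 MB; target data would be truncated.")
    return max(DEFAULT_STAGING_FILE_MB, largest_cell_mb)

print(recommended_staging_file_mb(60 * MB))  # -> 60
print(recommended_staging_file_mb(10 * MB))  # -> 50 (default is already enough)
```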
