PowerCenter Modernization

Databricks Delta advanced properties

The following tables list the Databricks Delta properties that you can configure for a Connection Map configuration:

Object properties

The following table describes the Databricks Delta connection properties:
Connection property
Description
Schema
The Databricks Delta schema name to use when creating the object. The schema name is similar to the schema name specified in the SCHEMAMAP.properties file.
Default is schema_changeit.

Source advanced properties

The following table describes the Databricks Delta source advanced properties:
Property name
Description
Schema Name
Overrides the schema specified in the connection.
Default is $$dbd_schema.
Staging Location
Relative directory path to store the staging files.
If the Databricks cluster is deployed on AWS, use the path relative to the Amazon S3 staging bucket.
If the Databricks cluster is deployed on Azure, use the path relative to the Azure Data Lake Store Gen2 staging filesystem name.
Default is $$dbd_staging_loc.
Database Name
Overrides the database name provided in the connection and the database name provided during metadata import.
To read from multiple objects, ensure that you have specified the database name in the connection properties.
Job Timeout
Maximum time, in seconds, that the Spark job can take to complete processing.
If the job is not completed within the time specified, the Databricks cluster terminates the job and the mapping fails.
If the job timeout is not specified, the mapping shows success or failure based on the job completion.
Job Status Poll Interval
Poll interval in seconds at which the Secure Agent checks the status of the job completion.
Default is 30 seconds.
DB REST API Timeout
Maximum time, in seconds, for which the Secure Agent retries REST API calls to Databricks when a network connection error occurs or the REST endpoint returns a 5xx HTTP error code.
Default is 10 minutes.
DB REST API Retry Interval
Time interval, in seconds, at which the Secure Agent retries the REST API call when a network connection error occurs or the REST endpoint returns a 5xx HTTP error code.
This value does not apply to the job status REST API. Use the Job Status Poll Interval value for the job status REST API.
Default is 30 seconds.
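
The interaction between Job Timeout and Job Status Poll Interval described above can be pictured with a small polling loop. This is only an illustrative sketch, not the Secure Agent's actual implementation: the function name `wait_for_job`, the status strings, and the injectable `sleep` and `clock` parameters are all hypothetical and exist only to make the example self-contained.

```python
import time
from typing import Callable, Optional


def wait_for_job(
    get_status: Callable[[], str],
    poll_interval_s: float = 30,            # documented default poll interval
    job_timeout_s: Optional[float] = None,  # None models "job timeout not specified"
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll a job until it finishes or the optional timeout elapses.

    The status names used here are assumptions made for illustration.
    """
    start = clock()
    while True:
        status = get_status()
        if status in ("SUCCESS", "FAILED"):
            # Without a timeout, the outcome depends only on job completion.
            return status
        if job_timeout_s is not None and clock() - start >= job_timeout_s:
            # With a timeout, an unfinished job is treated as a failure.
            return "TIMED_OUT"
        sleep(poll_interval_s)
```

With no timeout configured, the loop in this sketch only ends when the job reports a terminal status, which mirrors the statement that the mapping result then depends on job completion.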

Target advanced properties

The following table describes the Databricks Delta target advanced properties:
Property Name
Description
Schema Name
Overrides the schema specified in the connection.
Default is $$dbd_schema.
Staging Location
Relative directory path to store the staging files.
If the Databricks cluster is deployed on AWS, use the path relative to the Amazon S3 staging bucket.
If the Databricks cluster is deployed on Azure, use the path relative to the Azure Data Lake Store Gen2 staging filesystem name.
Default is $$dbd_staging_loc.
Target Database Name
Overrides the database name provided in the connection and the database selected in the metadata browser for existing targets.
Write Disposition
Determines whether the write overwrites the existing data in a table or adds to it. You can select from the following options:
  • Append. Appends data to the existing data in the table, even if the table is empty.
  • Truncate. Overwrites the existing data in the table. Applies only to the insert operation and non-empty sources.
  • Truncate Always. Overwrites the existing data in the table. Applies to insert, update, upsert, and delete target operations for empty and non-empty sources.
DTM Staging File Size
The size of the flat file that Data Integration creates locally in a temporary folder to stage the data before writing to Databricks.
Default is 50 MB.
Job Timeout
Maximum time, in seconds, that the Spark job can take to complete processing.
If the job is not completed within the time specified, the job terminates and the mapping fails.
If the job timeout is not specified, the mapping shows success or failure based on the job completion.
Job Status Poll Interval
Poll interval in seconds at which the Secure Agent checks the status of the job completion.
Default is 30 seconds.
DB REST API Timeout
Maximum time, in seconds, for which the Secure Agent retries REST API calls to Databricks when a network connection error occurs or the REST endpoint returns a 5xx HTTP error code.
Default is 10 minutes.
DB REST API Retry Interval
Time interval, in seconds, at which the Secure Agent retries the REST API call when a network connection error occurs or the REST endpoint returns a 5xx HTTP error code.
This value does not apply to the job status REST API. Use the Job Status Poll Interval value for the job status REST API.
Default is 30 seconds.
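
The Write Disposition options in the table above can be summarized as a small decision function. The sketch below only encodes the rules as they are stated in the option descriptions; the function name `truncates_existing_data` and its parameters are hypothetical, and the sketch does not describe how Data Integration performs the write.

```python
# Target operations that the Truncate Always description lists explicitly.
TRUNCATE_ALWAYS_OPERATIONS = {"insert", "update", "upsert", "delete"}


def truncates_existing_data(disposition: str, operation: str, source_is_empty: bool) -> bool:
    """Return True if existing table data is overwritten under the stated rules."""
    if disposition == "Append":
        # Append always adds to the existing data.
        return False
    if disposition == "Truncate":
        # Truncate applies only to insert operations with non-empty sources.
        return operation == "insert" and not source_is_empty
    if disposition == "Truncate Always":
        # Truncate Always applies to the listed operations,
        # for empty and non-empty sources alike.
        return operation in TRUNCATE_ALWAYS_OPERATIONS
    raise ValueError(f"Unknown write disposition: {disposition!r}")
```

For example, under these rules `truncates_existing_data("Truncate", "insert", True)` is false because the source is empty, whereas the same call with "Truncate Always" is true.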

Lookup advanced properties

The following table describes the Databricks Delta lookup transformation advanced properties:
Property Name
Description
Schema Name
Overrides the schema specified in the connection.
Default is $$dbd_schema.
Staging Location
Relative directory path to store the staging files.
If the Databricks cluster is deployed on AWS, use the path relative to the Amazon S3 staging bucket.
If the Databricks cluster is deployed on Azure, use the path relative to the Azure Data Lake Store Gen2 staging filesystem name.
Default is $$dbd_staging_loc.
Database Name
Overrides the database specified in the connection.
Job Timeout
Maximum time, in seconds, that the Spark job can take to complete processing. If the job is not completed within the time specified, the Databricks cluster terminates the job and the mapping fails.
If the job timeout is not specified, the mapping shows success or failure based on the job completion.
Job Status Poll Interval
Poll interval in seconds at which the Secure Agent checks the status of the job completion. Default is 30 seconds.
DB REST API Timeout
Maximum time, in seconds, for which the Secure Agent retries REST API calls to Databricks when a network connection error occurs or the REST endpoint returns a 5xx HTTP error code.
Default is 10 minutes.
DB REST API Retry Interval
Time interval, in seconds, at which the Secure Agent retries the REST API call when a network connection error occurs or the REST endpoint returns a 5xx HTTP error code.
This value does not apply to the job status REST API. Use the Job Status Poll Interval value for the job status REST API.
Default is 30 seconds.
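
The relationship between DB REST API Timeout and DB REST API Retry Interval can be sketched as a retry loop: a call is retried at a fixed interval while it fails with a network error or a 5xx response, until the overall timeout is used up. This is a minimal illustration under stated assumptions, not the Secure Agent's code. The function `call_with_retry`, the use of `ConnectionError` to stand in for a network failure, and the injectable `sleep` and `clock` parameters are hypothetical; the 600-second default simply expresses the documented 10 minutes in seconds.

```python
import time
from typing import Callable, Optional


def call_with_retry(
    call: Callable[[], int],        # hypothetical: returns an HTTP status code
    retry_interval_s: float = 30,   # documented default retry interval
    timeout_s: float = 600,         # documented default of 10 minutes, in seconds
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Retry a REST call on network errors or 5xx responses until the timeout."""
    start = clock()
    while True:
        status: Optional[int]
        try:
            status = call()
        except ConnectionError:
            status = None  # treat a network failure as retryable
        retryable = status is None or 500 <= status <= 599
        if not retryable:
            return status
        if clock() - start >= timeout_s:
            raise TimeoutError("REST API call kept failing until the timeout elapsed")
        sleep(retry_interval_s)
```

Responses outside the 5xx range are returned immediately in this sketch, since the property descriptions mention retries only for network errors and 5xx error codes.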
