Databricks Connector

Rules and guidelines for SQL ELT optimization

Use the following rules and guidelines when you enable a mapping for SQL ELT optimization to a Databricks database:

General rules and guidelines

Use the following rules and guidelines when you configure SQL ELT optimization in a mapping:
  • When you use the IN() function with the OR, AND, or NOT operator in an IIF() function, the IIF() function returns incorrect results. See the sketch after this list.
  • When you configure a mapping to read from an Amazon S3 or a Microsoft Azure Data Lake Storage Gen2 source, you cannot use a Databricks connection in the Lookup transformation in the mapping.
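
The following sketch illustrates the IN() guideline above. It is a minimal, hypothetical Expression transformation expression; the port names STATUS and REGION and the values are illustrative assumptions, and the rewrite with explicit comparisons is one possible workaround, not a documented fix:

  -- Can return incorrect results when pushed down: IN() combined with OR inside IIF()
  IIF(IN(STATUS, 'A', 'B') OR REGION != 'EU', 'match', 'nomatch')
  -- Equivalent form that avoids IN() by using explicit comparisons
  IIF(STATUS = 'A' OR STATUS = 'B' OR REGION != 'EU', 'match', 'nomatch')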

Mapping with Databricks source and target

Use the following rules and guidelines when you configure SQL ELT optimization in a mapping that reads from and writes to Databricks:
  • When you provide the database name in the connection, the following rules and guidelines apply:
    • For full SQL ELT optimization, temporary staging tables or views are created in the database name provided in the Database attribute of the target connection. If the attribute is empty, the database name is taken from the JDBC URL of the target connection.
    • For source SQL ELT optimization, temporary staging tables or views are created in the database name provided in the Database attribute of the source connection. If the attribute is empty, the database name is taken from the JDBC URL of the source connection.
    • The database provided in the connection must allow read and write access. If the Database Name field is empty, the database name provided in the JDBC URL takes precedence and must also allow read and write access.
    These guidelines apply only to custom query support, SQL override, and SQL ELT optimization in Databricks using flat file formats.
  • You cannot use the Data Driven target operation with the Update Mode advanced target property.
  • The LAST() function is non-deterministic. It can return different results each time it is called, even when you provide the same input values.
  • When you configure a Filter transformation or specify a filter condition, do not specify special characters.
  • When you configure a mapping enabled for full SQL ELT optimization to read from multiple sources and you override the database name and table name from the advanced properties, the mapping fails.
  • To configure a Filter transformation or specify a filter condition on columns of the Date or Timestamp data type in a Databricks table, you must pass the data through the TO_DATE() function as an expression in the filter condition, as illustrated in the filter sketch after this list.
  • When you configure a mapping enabled for full SQL ELT optimization on the Databricks SQL engine and push the data to the Databricks target, ensure that you map all the fields in the target. Otherwise, the mapping fails.
  • When you create a new target at runtime, you must not specify a database name and table name in the Target Database Name and Target Table Name fields in the target advanced properties.
  • When you read data from a column of the Date data type and write it to a column of the Date data type, the SQL ELT query pushes the Date column and casts it to the Timestamp data type.
  • You cannot completely parameterize a multi-line custom query using a parameter file. If you specify a multi-line custom query in a parameter file, the mapping considers only the first line of the multi-line query.
  • When you push the CRC32() function to Databricks, the data type of the return value is either Bigint or String.
  • When you push the DATE_DIFF() function to Databricks, the function returns the integral part of the value and not the fractional part.
  • When you push the GREATEST() function to Databricks and configure input value arguments of String data type, you must not specify the caseFlag argument.
  • To push the TO_CHAR(DATE) function to Databricks, use the following string and format arguments (see the format-string sketch after this list):
    • YYYY
    • YY
    • MM
    • MON
    • MONTH
    • DD
    • DDD
    • DY
    • DAY
    • HH12
    • HH24
    • MI
    • Q
    • SS
    • SS.MS
    • SS.US
    • SS.NS
  • To push the TO_DATE(string, format) function to Databricks, you must use the following format arguments:
    • YYYY
    • YY
    • MM
    • MON
    • MONTH
    • DD
    • DDD
    • HH12
    • HH24
    • MI
    • SS
    • SS.MS
    • SS.US
    • SS.NS
  • When you enable full SQL ELT optimization in a mapping and use the IIF() condition in an Expression transformation, the mapping fails for the following functions:
    • IS_SPACES
    • IS_NUMBER
    • IS_DATE
  • A mapping enabled with full SQL ELT optimization that contains an SQL transformation fails when the column names in the SQL override query don't match the column names in the custom query.
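
The following filter sketch illustrates the Date and Timestamp filter guideline. It is a minimal, hypothetical filter condition; the port name ORDER_DATE, the literal, and the separator characters are illustrative assumptions, and the format tokens come from the supported lists above:

  -- Filter condition on a Date column; the comparison value passes through TO_DATE()
  ORDER_DATE >= TO_DATE('01/01/2024', 'MM/DD/YYYY')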
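
The following format-string sketch combines the supported TO_CHAR(DATE) and TO_DATE(string, format) tokens into complete format strings. The port names and separator characters are illustrative assumptions; only the format tokens come from the lists above:

  -- TO_CHAR(DATE): formats a Date port as a string using supported tokens
  TO_CHAR(ORDER_DATE, 'YYYY-MM-DD HH24:MI:SS')
  -- TO_DATE(string, format): parses a string port using supported tokens
  TO_DATE(ORDER_DATE_STR, 'MM/DD/YYYY HH24:MI:SS')
  -- A token outside the lists, such as RR, is not supported for pushdown:
  -- TO_DATE(ORDER_DATE_STR, 'MM/DD/RR')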

Mapping with Amazon S3 source and Databricks target

Use the following rules and guidelines when you configure SQL ELT optimization in a mapping that reads from an Amazon S3 source and writes to a Databricks target:
  • When you use an unconnected lookup and run a Databricks mapping enabled for SQL ELT optimization, the mapping fails if the IIF() function returns a Float value in the lookup query.
  • When you select the source type as directory in the advanced source properties, ensure that all the files in the directory contain the same schema.
  • When you select query as the source type in lookup, you cannot override the database name and table name in the advanced source properties.
  • When you include a source transformation in a mapping enabled with SQL ELT optimization, exclude the FileName field from the source. The FileName field is not applicable.
  • When you parameterize a lookup object in a mapping enabled with SQL ELT optimization, the mapping fails because you cannot exclude the FileName port at runtime.
  • When you parameterize the source object in a mapping task, ensure that you pass the source object parameter value with the fully qualified path in the parameter file, as in the parameter file sketch after this list.
  • You cannot use wildcard characters for the source file name and directory name in the source transformation.
  • You cannot use wildcard characters for the folder path or file name in the advanced source properties.
  • You cannot configure dynamic lookup cache.
  • When you use a Joiner transformation in a mapping enabled with SQL ELT optimization and create a new target at runtime, ensure that the fields do not have a not null constraint.
  • Ensure that the field names in Parquet, ORC, AVRO, or JSON files do not contain Unicode characters.
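
The following parameter file sketch shows one way to supply a fully qualified source object path. It is a minimal sketch: the section header, the parameter name $$srcObject, and the bucket, folder, and file names are all illustrative assumptions, not values from this documentation:

  [Global]
  $$srcObject=my-bucket/sales/orders.parquet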

Mapping with Azure Data Lake Storage Gen2 source and Databricks target

Use the following rules and guidelines when you configure SQL ELT optimization in a mapping that reads from an Azure Data Lake Storage Gen2 source and writes to a Databricks target:
  • Mappings fail if the lookup object contains unsupported data types.
  • When you select the source type as directory in the advanced source property, ensure that all the files in the directory contain the same schema.
  • When you select query as the source type in lookup, you cannot override the database name and table name in the advanced source properties.
  • When you include a source transformation in a mapping enabled with SQL ELT optimization, exclude the FileName field from the source. The FileName field is not applicable.
  • When you parameterize a lookup object in a mapping enabled with SQL ELT optimization, the mapping fails because you cannot exclude the FileName port at runtime.
  • When you parameterize the source object in a mapping task, ensure that you pass the source object parameter value with the fully qualified path in the parameter file.
  • You cannot use wildcard characters for the source file name and directory name in the source transformation.
  • You cannot configure dynamic lookup cache.
  • When you use a Joiner transformation in a mapping enabled with SQL ELT optimization and create a new target at runtime, ensure that the fields do not have a not null constraint.
  • Ensure that the field names in Parquet, ORC, AVRO, or JSON files do not contain Unicode characters.

Cross workspace mappings

When you set up a mapping enabled with full SQL ELT optimization to access data from a Databricks workspace, and the associated metastore resides in a separate workspace, the mapping runs without SQL ELT optimization.
