

Databricks Connector

Rules and guidelines for mappings in SQL ELT mode

Consider the following rules and guidelines when you run mappings in SQL ELT mode:

Functions

  • When you use the following functions, ensure that all the arguments are of the same data type:

      COALESCE

      DECODE

      ELT

      GREATEST

      IF

      IFF

      IFNULL

      LEAST

      NVL

      NVL2
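For example, a sketch of how mixed argument types might be aligned with an explicit cast before calling one of these functions. The table and column names here are hypothetical:

```sql
-- Hypothetical table: orders(amount INT, fallback STRING)
-- COALESCE(amount, fallback) mixes INT and STRING arguments,
-- which these functions do not accept in SQL ELT mode.
-- Cast the arguments to a single data type first:
SELECT COALESCE(amount, CAST(fallback AS INT)) AS amount_filled
FROM orders;
```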

  • When you use the following functions, the number of optional arguments must not exceed five:

      COALESCE

      CONCAT

      CONCAT_WS

      COUNT

      ELT

      GREATEST

      HASH

      JAVA_METHOD

      LEAST

      REFLECT

      TRY_REFLECT

      XXHASH64

    In the following example of the CONCAT_WS function, the first argument ',' is mandatory and the other arguments are optional:
    concat_ws(',', 'This', 'is', 'a', 'Spark', 'SQL');
    The output of this function is This,is,a,Spark,SQL.
  • When a String field with date or timestamp values in the source is mapped to a Date/Time field in the target, or a Date/Time field in the source is mapped to a String field with date or timestamp values in the target, ensure that the date and timestamp values are in a format that Databricks supports.
    For example, use the yyyy-MM-dd format for date values and the yyyy-MM-dd HH:mm:ss format for timestamp values in a String field.
    For more information about the date formats that Databricks supports, see Databricks datetime formats.
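For example, a conversion that uses these formats might be sketched as follows; the literal values are illustrative:

```sql
-- Convert string values to DATE and TIMESTAMP using the
-- formats that Databricks supports:
SELECT TO_DATE('2024-06-01', 'yyyy-MM-dd')                        AS d,
       TO_TIMESTAMP('2024-06-01 13:45:30', 'yyyy-MM-dd HH:mm:ss') AS ts;
```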
  • When you connect to Databricks endpoints hosted on Google Cloud Platform, you cannot use the AI functions.
  • When you use the LEAD() or LAG() function and the default value that the expression returns is of the Boolean data type, specify the function in the following format: LEAD(flag,3,BOOLEAN(TRUE)) or LAG(flag,3,BOOLEAN(TRUE)).
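A sketch of how the function might appear in a full window expression; the table name, ORDER BY column, and offset are hypothetical:

```sql
-- The Boolean default value is wrapped in BOOLEAN() as required:
SELECT flag,
       LEAD(flag, 3, BOOLEAN(TRUE)) OVER (ORDER BY event_id) AS flag_ahead,
       LAG(flag, 3, BOOLEAN(TRUE))  OVER (ORDER BY event_id) AS flag_behind
FROM events;
```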
  • Do not use keywords or function names as column names or output port names.
  • When you configure a Filter transformation and specify a simple filter condition on columns of the date or timestamp data type in a Databricks table, you must pass the data through the TO_DATE() or TO_TIMESTAMP() function as an expression in the filter condition.
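For example, a filter condition of this kind might be sketched as follows; the table and column names are hypothetical:

```sql
-- Pass date and timestamp literals through TO_DATE() / TO_TIMESTAMP()
-- in the filter condition:
SELECT *
FROM orders
WHERE order_date = TO_DATE('2024-01-15', 'yyyy-MM-dd')
  AND created_at >= TO_TIMESTAMP('2024-01-15 00:00:00', 'yyyy-MM-dd HH:mm:ss');
```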
  • You cannot use nested aggregate functions in an Aggregator transformation. For example, you cannot specify SUM(AVG(col1)) in the aggregate function. Use nested aggregations by adding multiple Aggregator transformations in the mapping.
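In SQL terms, the two-step equivalent of a nested aggregation might look like this sketch; the table and column names are hypothetical:

```sql
-- SUM(AVG(col1)) in a single aggregation is not supported.
-- Stage the inner aggregation first, then aggregate its result:
SELECT SUM(avg_col1) AS total_of_averages
FROM (
  SELECT dept, AVG(col1) AS avg_col1
  FROM t
  GROUP BY dept
);
```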
  • When you use the COUNT_MIN_SKETCH() function in an Aggregator or Expression transformation, ensure that the columns that you pass as epsilon and confidence arguments are of double data type.
    If the fields are not of double data type, use an Expression transformation to first convert the column data type to double, and then use the returned columns in a downstream Aggregator or Expression transformation.
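A sketch of the cast-then-aggregate pattern in SQL, assuming hypothetical table and column names and literal epsilon, confidence, and seed values:

```sql
-- count_min_sketch(column, eps, confidence, seed) expects the
-- epsilon and confidence arguments as DOUBLE; cast them explicitly:
SELECT count_min_sketch(item_id,
                        CAST(0.01 AS DOUBLE),
                        CAST(0.95 AS DOUBLE),
                        42) AS sketch
FROM events;
```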
