PowerExchange for Microsoft Azure Blob Storage User Guide

Rules and Guidelines for Data Types

Consider the following rules and guidelines for data types:
  • Avro data types support:
    • Date, Decimal, and Timestamp data types are applicable when you run a mapping in the native environment or on the Spark engine in the Cloudera CDH 6.3 distribution.
    • The Time data type is applicable when you run a mapping in the native environment in the Cloudera CDH 6.3 distribution.
  • JSON data types support:
    • For PowerExchange for Microsoft Azure Data Lake Storage Gen2, you can read and write complex file objects in JSON format in mappings that run in the native environment, on the Spark engine, and on the Databricks Spark engine.
      For other file-based adapters, you can read and write complex file objects in JSON format only in mappings that run on the Spark engine.
  • Parquet data types support:
    • Before you create and run a new mapping on the Databricks engine to read a Parquet file with hierarchical data types, you must set the
      -DINFA_HADOOP_DIST_DIR=hadoop\Databricks_7.2
      option in the
      developerCore.ini
      file.
    • When you set the
      -DINFA_HADOOP_DIST_DIR=hadoop\<Distro>
      option in the
      developerCore.ini
      file and import a Parquet file, the format of the imported metadata differs based on the distribution. For Cloudera CDP 7.1, the metadata is imported as string. For other supported distributions, the metadata is imported as UTF8.
    • Date, Time, and Timestamp data types with precision up to microseconds are applicable when you run a mapping in the native environment, on the Blaze engine, or on the Spark engine in the Hortonworks HDP 3.1, Azure HDInsight HDI 4.0, and Cloudera CDP 7.1 distributions.
    • Date, Time_Millis, and Timestamp_Millis data types are applicable when you run a mapping in the native environment or on the Spark engine in MapR 6.1.
    • Decimal data types are applicable when you run a mapping in the native environment and on the Spark engine in the Cloudera CDH 6.3, Hortonworks HDP 3.1, Amazon EMR 5.20, MapR 6.1, and Azure HDInsight HDI 4.0 distributions.
    • Date, Time, Timestamp, and Decimal data types are applicable when you run a mapping on the Databricks Spark engine.
    • When you run a mapping and use a Date data type that does not have a time value, the Data Integration Service adds a time value, based on the time zone, to the date in the target.
      For example, Date value used in the source:
      1980-01-09
      Value generated in the target:
      1980-01-09 00:00:00
    • When you run a mapping in the native environment and use a Time data type in the source, the Data Integration Service writes an incorrect date value to the target.
      For example, Time value used in the source:
      1980-01-09 06:56:01.365235000
      Incorrect date value generated in the target:
      1899-12-31 06:56:01.365235000
    • When you run a mapping in the native environment and use a Date data type in the source, the Data Integration Service writes an incorrect time value to the target.
      For example, Date value used in the source:
      1980-01-09 00:00:00
      Incorrect time value generated in the target:
      1980-01-09 05:30:00
  • To run a mapping that reads and writes Date, Time, Timestamp, and Decimal data types, set the
    -DINFA_HADOOP_DIST_DIR
    option in the
    developerCore.ini
    file. The
    developerCore.ini
    file is located in the following directory:
    <Client installation directory>\clients\DeveloperClient\
    Add the following option to the
    developerCore.ini
    file:
    -DINFA_HADOOP_DIST_DIR=hadoop\<Hadoop distribution>_<version>
    For example:
    -DINFA_HADOOP_DIST_DIR=hadoop\CDH_6.3
  • To use precision up to 38 digits for the Decimal data type in the native environment, set the
    EnableSDKDecimal38
    custom property to
    true
    for the Data Integration Service. The
    EnableSDKDecimal38
    custom property is applicable to all file-based PowerExchange adapters except PowerExchange for HDFS.
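
The Date and Time examples in the Parquet guidelines above can be reproduced with plain date arithmetic. The following Python sketch is only an illustration of the values shown in this section, not the Data Integration Service implementation. The base date 1899-12-31 is taken from the Time example. The +05:30 offset is an assumption chosen because it matches the 05:30:00 shift in the Date example; this guide does not state which time zone produced that value. Python datetime values hold microseconds, so the trailing nanosecond digits from the examples are not shown.

```python
from datetime import date, datetime, time, timedelta

# Source value from the Date example in this section.
source_date = date(1980, 1, 9)

# A Date without a time value, widened to a timestamp at midnight.
as_timestamp = datetime.combine(source_date, time.min)
print(as_timestamp)  # 1980-01-09 00:00:00

# Time example: the time of day survives, but the date part is replaced
# by the base date that appears in the target value.
BASE_DATE = date(1899, 12, 31)
source_time = time(6, 56, 1, 365235)
time_on_base_date = datetime.combine(BASE_DATE, source_time)
print(time_on_base_date)  # 1899-12-31 06:56:01.365235

# ASSUMPTION: a +05:30 offset, picked only because it reproduces the
# 05:30:00 value in the Date example. The guide does not name a time zone.
ASSUMED_OFFSET = timedelta(hours=5, minutes=30)
shifted = as_timestamp + ASSUMED_OFFSET
print(shifted)  # 1980-01-09 05:30:00
```

A comparison like this can help you recognize the two native-environment issues in target data: a date part of 1899-12-31 points to a Time source column, and a non-midnight time on a date-only value points to a time zone shift.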
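
The -DINFA_HADOOP_DIST_DIR setting described above is a single line in the developerCore.ini file, so it can be added or updated with a small script. The following Python sketch is a minimal illustration under stated assumptions: the function name set_dist_dir is hypothetical, the script assumes that each option sits on its own line, and it assumes that appending the option at the end of the file is acceptable when the option is missing. The sample file content is made up for the example and is not the real content of developerCore.ini.

```python
OPTION = "-DINFA_HADOOP_DIST_DIR"


def set_dist_dir(ini_text: str, distribution: str, version: str) -> str:
    # Hypothetical helper: returns the ini text with the option set to
    # hadoop\<distribution>_<version>, replacing an existing entry if present.
    new_line = f"{OPTION}=hadoop\\{distribution}_{version}"
    lines = ini_text.splitlines()
    replaced = False
    for index, line in enumerate(lines):
        if line.strip().startswith(OPTION + "="):
            lines[index] = new_line
            replaced = True
    if not replaced:
        # ASSUMPTION: the option can be appended as the last line.
        lines.append(new_line)
    return "\n".join(lines) + "\n"


# Made-up sample content, for illustration only.
sample = "-vmargs\n-Xmx1024m\n"
updated = set_dist_dir(sample, "CDH", "6.3")
print(updated)
```

To apply the change to a real installation, you would read the file from the Developer client directory named above, pass its text to the function, and write the result back. Back up the file first.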
