Rules and Guidelines for Data Types

Consider the following rules and guidelines for data types:
  • Avro data types support:
    • Date, Decimal, and Timestamp data types are applicable when you run a mapping in the native environment or on the Spark engine in the Cloudera CDH 6.3 distribution.
    • Time data type is applicable when you run a mapping in the native environment in the Cloudera CDH 6.3 distribution.
  • JSON data types support:
    • For PowerExchange for Microsoft Azure Data Lake Storage Gen2, you can read and write complex file objects in JSON format in mappings that run in the native environment, on the Spark engine, and on the Databricks Spark engine.
      For other file-based adapters, you can read and write complex file objects in JSON format in mappings that run on the Spark engine only.
  • Parquet data types support:
    • Before you create and run a new mapping on the Databricks engine to read a Parquet file that contains string or hierarchical data types, you must set the
      -DINFA_HADOOP_DIST_DIR=hadoop\Databricks_7.2
      option in the
      developerCore.ini
      file.
    • When you set the
      -DINFA_HADOOP_DIST_DIR=hadoop\<Distro>
      option in the
      developerCore.ini
      file and import a Parquet file, the format of the imported metadata differs based on the distribution. For Cloudera CDP 7.1, the metadata is imported as string. For other supported distributions, the metadata is imported as UTF8.
    • Date, Time, and Timestamp data types with precision up to microseconds are applicable when you run a mapping in the native environment, on the Blaze engine, or on the Spark engine in the Hortonworks HDP 3.1, Azure HDInsight HDI 4.0, and Cloudera CDP 7.1 distributions.
    • Date, Time_Millis, and Timestamp_Millis data types are applicable when you run a mapping in the native environment or on the Spark engine in the MapR 6.1 distribution.
    • Decimal data types are applicable when you run a mapping in the native environment or on the Spark engine in the Cloudera CDH 6.3, Hortonworks HDP 3.1, Amazon EMR 5.20, MapR 6.1, and Azure HDInsight HDI 4.0 distributions.
    • Date, Time, Timestamp, and Decimal data types are applicable when you run a mapping on the Databricks Spark engine.
    • When you run a mapping and use a Date value that does not include a time value, the Data Integration Service adds a time value, based on the time zone, to the date in the target.
      For example, Date data type used in the source:
      1980-01-09
      Value generated in the target:
      1980-01-09 00:00:00
    • When you run a mapping in the native environment and use the Time data type in the source, the Data Integration Service writes an incorrect date value to the target.
      For example, Time data type used in the source:
      1980-01-09 06:56:01.365235000
      Incorrect Date value generated in the target:
      1899-12-31 06:56:01.365235000
    • When you run a mapping in the native environment and use the Date data type in the source, the Data Integration Service writes an incorrect time value to the target.
      For example, Date data type used in the source:
      1980-01-09 00:00:00
      Incorrect Time value generated in the target:
      1980-01-09 05:30:00
  • To run a mapping that reads and writes Date, Time, Timestamp, and Decimal data types, update the
    -DINFA_HADOOP_DIST_DIR
    option in the
    developerCore.ini
    file. The
    developerCore.ini
    file is located in the following directory:
    <Client installation directory>\clients\DeveloperClient\
    Add the following option to the
    developerCore.ini
    file:
    -DINFA_HADOOP_DIST_DIR=hadoop\<Hadoop distribution>_<version>
    For example:
    -DINFA_HADOOP_DIST_DIR=hadoop\CDH_6.3
  • To use precision up to 38 digits for the Decimal data type in the native environment, set the
    EnableSDKDecimal38
    custom property to
    true
    for the Data Integration Service. The
    EnableSDKDecimal38
    custom property is applicable to all file-based PowerExchange adapters except PowerExchange for HDFS.
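
The developerCore.ini update described above can be scripted. The following Python sketch is illustrative only and is not part of the product: the helper name set_dist_dir is hypothetical, and it assumes that developerCore.ini is a plain-text file with one option per line. It replaces an existing -DINFA_HADOOP_DIST_DIR line in place, or appends one if none exists.

```python
OPTION = "-DINFA_HADOOP_DIST_DIR"


def set_dist_dir(ini_text: str, distribution: str) -> str:
    """Return ini_text with exactly one -DINFA_HADOOP_DIST_DIR line.

    distribution is the <Hadoop distribution>_<version> folder name,
    for example "CDH_6.3". Assumes one option per line (an assumption
    about the file layout, not something the guide states).
    """
    new_line = f"{OPTION}=hadoop\\{distribution}"
    out = []
    replaced = False
    for line in ini_text.splitlines():
        if line.strip().startswith(OPTION + "="):
            # Replace the first occurrence in place and drop any duplicates.
            if not replaced:
                out.append(new_line)
                replaced = True
        else:
            out.append(line)
    if not replaced:
        # No existing entry was found, so append the option at the end.
        out.append(new_line)
    return "\n".join(out) + "\n"


# Demonstration on an in-memory sample; the sample content is made up.
sample = "-vmargs\n-Xmx1024M\n"
print(set_dist_dir(sample, "CDH_6.3"))
```

To apply the change to a real installation, read the file from <Client installation directory>\clients\DeveloperClient\, pass its text to the function, and write the result back. Back up the file first, and restart the Developer tool so that it picks up the new value.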
