Parquet Data Types and Transformation Data Types

Parquet data types map to transformation data types that the Data Integration Service uses to move data across platforms.
The following table compares the Parquet data types that the Data Integration Service supports and the corresponding transformation data types:
| Parquet | Transformation | Range |
|---|---|---|
| Binary | Binary | 1 to 104,857,600 bytes |
| Binary (UTF8) | String | 1 to 104,857,600 characters |
| Boolean | Integer | -2,147,483,648 to 2,147,483,647. Precision of 10, scale of 0 |
| Date | Date/Time | January 1, 0001 to December 31, 9999. |
| Decimal | Decimal | Decimal value with declared precision and scale. Scale must be less than or equal to precision. For transformations that support precision up to 38 digits, the precision is 1 to 38 digits, and the scale is 0 to 38. For transformations that support precision up to 28 digits, the precision is 1 to 28 digits, and the scale is 0 to 28. If you specify the precision greater than the maximum number of digits, the Data Integration Service converts decimal values to double in high precision mode. |
| Double | Double | Precision of 15 digits. |
| Float | Double | Precision of 15 digits. |
| Int32 | Integer | -2,147,483,648 to 2,147,483,647. Precision of 10, scale of 0 |
| Int64 | Bigint | -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807. Precision of 19, scale of 0 |
| Map | Map | Unlimited number of characters. |
| Struct | Struct | Unlimited number of characters. |
| Time | Date/Time | Time of the day. Precision to microsecond. |
| Timestamp | Date/Time | January 1, 0001 00:00:00 to December 31, 9999 23:59:59.997. Precision to microsecond. |
| group (LIST) | Array | Unlimited number of characters. |
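The mapping above can be sketched as a small lookup, which may help when you check a Parquet schema by hand before building a mapping. This is an illustration only and not an Informatica API: the names `PARQUET_TO_TRANSFORMATION`, `transformation_type`, and `decimal_target` are hypothetical, and the `max_precision` parameter assumes you already know whether the transformation supports 38 or 28 digits.

```python
# Illustrative lookup transcribed from the table above.
# Not an Informatica API; every name here is hypothetical.
PARQUET_TO_TRANSFORMATION = {
    "binary": "Binary",
    "binary (utf8)": "String",
    "boolean": "Integer",
    "date": "Date/Time",
    "decimal": "Decimal",
    "double": "Double",
    "float": "Double",
    "int32": "Integer",
    "int64": "Bigint",
    "map": "Map",
    "struct": "Struct",
    "time": "Date/Time",
    "timestamp": "Date/Time",
    "group (list)": "Array",
}


def transformation_type(parquet_type: str) -> str:
    """Return the transformation data type listed for a Parquet type name."""
    key = parquet_type.strip().lower()
    if key not in PARQUET_TO_TRANSFORMATION:
        raise KeyError(f"No mapping listed for Parquet type: {parquet_type!r}")
    return PARQUET_TO_TRANSFORMATION[key]


def decimal_target(precision: int, scale: int, max_precision: int = 38) -> str:
    """Apply the Decimal rules from the table.

    max_precision is 38 or 28, depending on what the transformation supports
    (an input the caller must supply; it is not detected here).
    """
    if precision < 1 or scale < 0:
        raise ValueError("precision must be >= 1 and scale must be >= 0")
    if scale > precision:
        raise ValueError("scale must be less than or equal to precision")
    if precision > max_precision:
        # Per the table: decimal values are converted to double
        # when the precision exceeds the supported maximum.
        return "Double"
    return "Decimal"
```

For example, `transformation_type("Int64")` returns `"Bigint"`, and `decimal_target(30, 2, max_precision=28)` returns `"Double"` because the declared precision exceeds the 28-digit limit.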
The Parquet schema that you specify to read or write a Parquet file must be in lowercase. Parquet does not support case-sensitive schemas.
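A minimal sketch of how you might check field names against this requirement before you use a schema. It works on a plain list of field names. The helper names `non_lowercase_fields` and `to_lowercase_schema` are hypothetical, and the duplicate check is an added assumption: lowercasing two names that differ only by case would make them collide.

```python
def non_lowercase_fields(field_names):
    """Return the field names that contain uppercase characters."""
    return [name for name in field_names if name != name.lower()]


def to_lowercase_schema(field_names):
    """Return the field names in lowercase.

    Raises ValueError if two names become identical after lowercasing,
    because the resulting schema would be ambiguous.
    """
    lowered = [name.lower() for name in field_names]
    duplicates = sorted({n for n in lowered if lowered.count(n) > 1})
    if duplicates:
        raise ValueError(f"Names collide after lowercasing: {duplicates}")
    return lowered
```

Calling `non_lowercase_fields(["id", "OrderDate", "total"])` returns `["OrderDate"]`, which identifies the field to rename.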

Parquet Timestamp Data Type Support

The following table lists the Timestamp data type support for Parquet file formats:
| Timestamp Data Type | Native | Spark |
|---|---|---|
| Timestamp_micros | Yes | Yes |
| Timestamp_millis | Yes | No |
| Time_millis | Yes | No |
| Time_micros | Yes | No |
| int96 | Yes | Yes |
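The support table can be expressed as a lookup so that a pre-flight script can tell whether a column's timestamp type is listed as supported under the Native or Spark column. This is a sketch transcribed from the table, not an Informatica API. `TIMESTAMP_SUPPORT`, `is_supported`, and `types_supported_on` are hypothetical names.

```python
# Transcribed from the support table above; True means "Yes".
TIMESTAMP_SUPPORT = {
    "timestamp_micros": {"native": True, "spark": True},
    "timestamp_millis": {"native": True, "spark": False},
    "time_millis": {"native": True, "spark": False},
    "time_micros": {"native": True, "spark": False},
    "int96": {"native": True, "spark": True},
}


def is_supported(type_name: str, engine: str) -> bool:
    """Return True if the table lists the type as supported for the engine."""
    row = TIMESTAMP_SUPPORT.get(type_name.strip().lower())
    if row is None:
        # Types that the table does not list are treated as unsupported.
        return False
    return row.get(engine.strip().lower(), False)


def types_supported_on(engine: str):
    """Return the sorted type names that the table marks 'Yes' for the engine."""
    return sorted(name for name in TIMESTAMP_SUPPORT if is_supported(name, engine))
```

Treating unlisted types as unsupported is a conservative choice made for this sketch; the table itself says nothing about types that it does not list.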

Unsupported Parquet Data Types

The Developer tool does not support the following Parquet data types:
  • Timestamp_nanos
  • Time_nanos
  • Timestamp_tz
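As a rough illustration, the list above can be used to flag columns before you import a Parquet file. The input format, a dictionary that maps column names to type names, is an assumption made for this sketch, and `find_unsupported_columns` is a hypothetical helper.

```python
# Types listed above as unsupported by the Developer tool.
UNSUPPORTED_TYPES = {"timestamp_nanos", "time_nanos", "timestamp_tz"}


def find_unsupported_columns(column_types):
    """Return the sorted column names whose type is in the unsupported list.

    column_types maps a column name to its Parquet type name,
    for example {"created_at": "Timestamp_nanos"}.
    """
    return sorted(
        column
        for column, type_name in column_types.items()
        if type_name.strip().lower() in UNSUPPORTED_TYPES
    )
```

For a schema such as `{"id": "Int64", "created_at": "Timestamp_nanos"}`, the function returns `["created_at"]`.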
