Complex File Sources

A mapping that runs in the Hadoop environment can process complex files.
You can read files from the local file system or from the Hadoop Distributed File System (HDFS). To read large volumes of data, you can connect a complex file source to a directory of files that share the same format and properties. You can also read compressed binary files.
A mapping that runs on the Blaze engine or the Hive engine can contain a Data Processor transformation. To read complex files that are flat files, you can include a complex file reader object without a Data Processor transformation. If the complex file is a hierarchical file, you must connect the complex file reader object to a Data Processor transformation.
A mapping that runs on the Spark engine can process hierarchical data through complex data types. Use a complex file data object that represents the complex files in HDFS. If the complex file contains hierarchical data, you must enable the read operation to project columns as complex data types.
The following table shows the complex files that a mapping can process in the Hadoop environment:

| File Type | Format       | Blaze Engine  | Spark Engine  | Hive Engine   |
|-----------|--------------|---------------|---------------|---------------|
| Avro      | Flat         | Supported     | Supported     | Supported     |
| Avro      | Hierarchical | Supported*    | Supported**   | Supported*    |
| JSON      | Flat         | Supported*    | Supported     | Supported*    |
| JSON      | Hierarchical | Supported*    | Supported**   | Supported*    |
| ORC       | Flat         | Not supported | Supported     | Not supported |
| ORC       | Hierarchical | Not supported | Not supported | Not supported |
| Parquet   | Flat         | Supported     | Supported     | Supported     |
| Parquet   | Hierarchical | Supported*    | Supported**   | Supported*    |
| XML       | Flat         | Supported*    | Not supported | Supported*    |
| XML       | Hierarchical | Supported*    | Not supported | Supported*    |

* The complex file reader object must be connected to a Data Processor transformation.
** The complex file reader object must be enabled to project columns as complex data types.
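
When planning a mapping, it can help to check the support matrix programmatically. The following Python sketch transcribes the table and its two footnotes into a lookup. It is only an illustration: the names `Support`, `SUPPORT_MATRIX`, `ENGINES`, and `lookup_support` are hypothetical and are not part of any Informatica API.

```python
from enum import Enum


class Support(Enum):
    """Support levels taken from the table and its footnotes (* and **)."""
    YES = "Supported"
    DATA_PROCESSOR = "Supported (requires a Data Processor transformation)"
    COMPLEX_TYPES = "Supported (requires projecting columns as complex data types)"
    NO = "Not supported"


# Short aliases to keep the matrix readable.
Y, DP, CT, N = (Support.YES, Support.DATA_PROCESSOR,
                Support.COMPLEX_TYPES, Support.NO)

# Column order of each tuple in SUPPORT_MATRIX.
ENGINES = ("blaze", "spark", "hive")

# (file type, format) -> support level per engine, in ENGINES order.
SUPPORT_MATRIX = {
    ("avro", "flat"): (Y, Y, Y),
    ("avro", "hierarchical"): (DP, CT, DP),
    ("json", "flat"): (DP, Y, DP),
    ("json", "hierarchical"): (DP, CT, DP),
    ("orc", "flat"): (N, Y, N),
    ("orc", "hierarchical"): (N, N, N),
    ("parquet", "flat"): (Y, Y, Y),
    ("parquet", "hierarchical"): (DP, CT, DP),
    ("xml", "flat"): (DP, N, DP),
    ("xml", "hierarchical"): (DP, N, DP),
}


def lookup_support(file_type: str, fmt: str, engine: str) -> Support:
    """Return the support level for a file type, format, and engine.

    Raises KeyError for an unknown file type or format, and ValueError
    for an unknown engine.
    """
    row = SUPPORT_MATRIX[(file_type.lower(), fmt.lower())]
    return row[ENGINES.index(engine.lower())]


if __name__ == "__main__":
    for engine in ENGINES:
        level = lookup_support("JSON", "Hierarchical", engine)
        print(f"JSON hierarchical on {engine}: {level.value}")
```

Keeping the footnote conditions as distinct enum members, rather than a plain yes/no flag, preserves the difference between engines that need a Data Processor transformation and the Spark engine, which needs complex data type projection.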


Updated November 09, 2018