Table of Contents

  1. Preface
  2. Introduction to Informatica Big Data Management
  3. Connections
  4. Mappings in the Hadoop Environment
  5. Mapping Objects in the Hadoop Environment
  6. Processing Hierarchical Data on the Spark Engine
  7. Stateful Computing on the Spark Engine
  8. Monitoring Mappings in the Hadoop Environment
  9. Mappings in the Native Environment
  10. Profiles
  11. Native Environment Optimization
  12. Data Type Reference
  13. Complex File Data Object Properties
  14. Function Reference
  15. Parameter Reference

Compression and Decompression for Complex File Sources and Targets

You can read and write compressed complex files, specify compression formats, and decompress files. You can use built-in compression formats such as Bzip2 and Lzo, or specify a custom compression format. The compressed files must be in binary format.
You can compress sequence files at the record level or at the block level.
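In Big Data Management you choose the compression type through the data object properties. As a point of reference, the following Hadoop API sketch shows the difference between the two levels when writing a sequence file directly: the CompressionType argument selects RECORD or BLOCK. The output path and key-value types are placeholders.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.SequenceFile;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.io.compress.CompressionCodec;
    import org.apache.hadoop.io.compress.GzipCodec;
    import org.apache.hadoop.util.ReflectionUtils;

    public class SequenceFileCompressionSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            CompressionCodec codec = ReflectionUtils.newInstance(GzipCodec.class, conf);

            // CompressionType.RECORD compresses each record value on its own.
            // CompressionType.BLOCK buffers many records and compresses them
            // together, which usually yields a better compression ratio.
            try (SequenceFile.Writer writer = SequenceFile.createWriter(conf,
                    SequenceFile.Writer.file(new Path("/tmp/example.seq")),  // placeholder path
                    SequenceFile.Writer.keyClass(IntWritable.class),
                    SequenceFile.Writer.valueClass(Text.class),
                    SequenceFile.Writer.compression(
                            SequenceFile.CompressionType.BLOCK, codec))) {
                writer.append(new IntWritable(1), new Text("first record"));
                writer.append(new IntWritable(2), new Text("second record"));
            }
        }
    }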
For information about how Hadoop processes compressed and uncompressed files, see the Hadoop documentation.
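The Auto option in the table below detects the compression format from the file extension. The following sketch shows the analogous mechanism in the Hadoop API, where CompressionCodecFactory maps an extension such as .gz or .bz2 to a codec that wraps the raw input stream; the input path is a placeholder.

    import java.io.InputStream;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.compress.CompressionCodec;
    import org.apache.hadoop.io.compress.CompressionCodecFactory;

    public class AutoDetectSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            Path file = new Path("/data/input/orders.gz");  // placeholder path
            FileSystem fs = file.getFileSystem(conf);

            // Resolve the codec from the file extension (.gz -> GzipCodec,
            // .bz2 -> BZip2Codec, and so on). Returns null if the extension
            // does not match a registered codec.
            CompressionCodec codec = new CompressionCodecFactory(conf).getCodec(file);

            try (InputStream in = (codec == null)
                    ? fs.open(file)                            // read uncompressed
                    : codec.createInputStream(fs.open(file))) { // decompress on read
                // process the stream
            }
        }
    }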
The following table describes the complex file compression formats for binary files:

Compression Option   Description
None                 The file is not compressed.
Auto                 The Data Integration Service detects the compression format of the
                     file based on the file extension.
DEFLATE              The DEFLATE compression format, which uses a combination of the
                     LZ77 algorithm and Huffman coding.
Gzip                 The GNU zip compression format, which uses the DEFLATE algorithm.
Bzip2                The Bzip2 compression format, which uses the Burrows-Wheeler
                     algorithm.
Lzo                  The Lzo compression format, which uses the Lempel-Ziv-Oberhumer
                     algorithm.
Snappy               The LZ77-type compression format with a fixed, byte-oriented
                     encoding.
Custom               A custom compression format. If you select this option, you must
                     specify the fully qualified class name of a class that implements
                     the CompressionCodec interface in the Custom Compression Codec
                     field.
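For the Custom option, the class you name must implement the org.apache.hadoop.io.compress.CompressionCodec interface and be available on the cluster classpath. The skeleton below is a hypothetical outline of such a class; the name com.example.MyCodec and the .myc extension are illustrative only, and a real codec must supply the stream, Compressor, and Decompressor implementations marked in the comments.

    package com.example;

    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;
    import org.apache.hadoop.io.compress.CompressionCodec;
    import org.apache.hadoop.io.compress.CompressionInputStream;
    import org.apache.hadoop.io.compress.CompressionOutputStream;
    import org.apache.hadoop.io.compress.Compressor;
    import org.apache.hadoop.io.compress.Decompressor;

    // Hypothetical custom codec. You would enter "com.example.MyCodec"
    // in the Custom Compression Codec field.
    public class MyCodec implements CompressionCodec {

        @Override
        public CompressionOutputStream createOutputStream(OutputStream out) throws IOException {
            // Wrap "out" in a stream that compresses with your algorithm.
            throw new UnsupportedOperationException("not implemented in this sketch");
        }

        @Override
        public CompressionOutputStream createOutputStream(OutputStream out, Compressor compressor)
                throws IOException {
            return createOutputStream(out);
        }

        @Override
        public Class<? extends Compressor> getCompressorType() { return null; } // your Compressor class

        @Override
        public Compressor createCompressor() { return null; } // your Compressor instance

        @Override
        public CompressionInputStream createInputStream(InputStream in) throws IOException {
            // Wrap "in" in a stream that decompresses your format.
            throw new UnsupportedOperationException("not implemented in this sketch");
        }

        @Override
        public CompressionInputStream createInputStream(InputStream in, Decompressor decompressor)
                throws IOException {
            return createInputStream(in);
        }

        @Override
        public Class<? extends Decompressor> getDecompressorType() { return null; } // your Decompressor class

        @Override
        public Decompressor createDecompressor() { return null; } // your Decompressor instance

        // File extension that identifies files written with this codec.
        @Override
        public String getDefaultExtension() { return ".myc"; }
    }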

