PowerExchange for HDFS User Guide

Compression and Decompression for Complex File Sources and Targets

You can read and write compressed complex files, specify compression formats, and decompress files. You can use a compression format such as Bzip2 or Lzo, or specify a custom compression format. The compressed files must be in binary format.
You can compress sequence files at a record level or at a block level.
For information about how Hadoop processes compressed and uncompressed files, see the Hadoop documentation.
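
As an illustration of the record-level versus block-level distinction, the following is a minimal sketch that uses Hadoop's SequenceFile API to write a block-compressed sequence file with Bzip2. The class name, output path, and sample records are hypothetical; this shows the underlying Hadoop behavior, not the PowerExchange configuration itself.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.SequenceFile;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.io.compress.BZip2Codec;

    public class BlockCompressedSequenceFile {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            Path out = new Path("/tmp/records.seq");  // hypothetical output path

            // CompressionType.BLOCK compresses batches of records together,
            // which usually yields a better ratio; CompressionType.RECORD
            // compresses each value individually.
            try (SequenceFile.Writer writer = SequenceFile.createWriter(conf,
                    SequenceFile.Writer.file(out),
                    SequenceFile.Writer.keyClass(LongWritable.class),
                    SequenceFile.Writer.valueClass(Text.class),
                    SequenceFile.Writer.compression(
                            SequenceFile.CompressionType.BLOCK, new BZip2Codec()))) {
                writer.append(new LongWritable(1L), new Text("first record"));
                writer.append(new LongWritable(2L), new Text("second record"));
            }
        }
    }
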
The following compression options are available for binary complex files:

None: The file is not compressed.
Auto: The Data Integration Service detects the compression format of the file based on the file extension (see the detection sketch after this list).
DEFLATE: The DEFLATE compression format, which uses a combination of the LZ77 algorithm and Huffman coding.
Gzip: The GNU zip compression format, which uses the DEFLATE algorithm.
Bzip2: The Bzip2 compression format, which uses the Burrows–Wheeler algorithm.
Lzo: The Lzo compression format, which uses the Lempel-Ziv-Oberhumer algorithm.
Snappy: An LZ77-type compression format with a fixed, byte-oriented encoding. Snappy is the default compression format on the Spark engine.
Custom: A custom compression format. If you select this option, you must specify the fully qualified name of a class that implements the CompressionCodec interface in the Custom Compression Codec field (see the codec sketch after this list).
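
The Auto option's extension-based detection is the same kind of lookup that Hadoop exposes through CompressionCodecFactory, which maps an extension such as .bz2 or .gz to a codec class. The following is a minimal sketch of that lookup, with a hypothetical input path; it illustrates the mechanism, not the Data Integration Service's internal implementation.

    import java.io.BufferedReader;
    import java.io.InputStream;
    import java.io.InputStreamReader;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.compress.CompressionCodec;
    import org.apache.hadoop.io.compress.CompressionCodecFactory;

    public class ExtensionBasedDecompression {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            Path file = new Path("/data/in/events.bz2");  // hypothetical input path
            FileSystem fs = file.getFileSystem(conf);

            // Map the file extension to a codec: .bz2 -> BZip2Codec,
            // .gz -> GzipCodec, and so on. Returns null for an
            // unrecognized extension, that is, an uncompressed file.
            CompressionCodec codec = new CompressionCodecFactory(conf).getCodec(file);

            InputStream in = fs.open(file);
            if (codec != null) {
                in = codec.createInputStream(in);  // decompress transparently
            }
            try (BufferedReader reader =
                    new BufferedReader(new InputStreamReader(in))) {
                System.out.println(reader.readLine());
            }
        }
    }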
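
For the Custom option, the class that you name must implement the org.apache.hadoop.io.compress.CompressionCodec interface and be available on the classpath at run time. The following minimal sketch shows the shape of such a class: the hypothetical com.example.codec.MyDeflateCodec reuses Hadoop's built-in DefaultCodec, which already implements CompressionCodec, and overrides only the file extension. A real custom codec would also provide its own compressor and decompressor streams.

    package com.example.codec;  // hypothetical package

    import org.apache.hadoop.io.compress.DefaultCodec;

    // A minimal custom codec: inherits working DEFLATE compressor and
    // decompressor streams from DefaultCodec and changes only the file
    // extension that identifies the format. The fully qualified name,
    // com.example.codec.MyDeflateCodec, is what you would enter in the
    // Custom Compression Codec field.
    public class MyDeflateCodec extends DefaultCodec {
        @Override
        public String getDefaultExtension() {
            return ".mydef";  // hypothetical extension
        }
    }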
