PowerExchange for HDFS User Guide

Complex File Streaming

To write data to a complex file, include a Data Processor transformation in the mapping to convert the source data into a binary format. You can use the binary stream to write data to the complex file.
The Data Processor transformation continuously streams input to the complex file target. It sends end-of-file information after it finishes streaming a file, and it sends end-of-streaming information after it has streamed the entire input.
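The order of these signals can be illustrated with a toy model. The following Python sketch is not the Data Processor or PowerExchange for HDFS implementation. The names Signal, stream, and collect, and the chunk size, are hypothetical and exist only to show data chunks followed by an end-of-file signal per file and one end-of-streaming signal at the end of the input.

```python
from enum import Enum, auto


class Signal(Enum):
    # Hypothetical signal names used only for this illustration.
    DATA = auto()
    END_OF_FILE = auto()
    END_OF_STREAMING = auto()


def stream(files, chunk_size=4):
    """Yield (signal, chunk) pairs for each file, then one end-of-streaming signal."""
    for payload in files:
        for start in range(0, len(payload), chunk_size):
            yield Signal.DATA, payload[start:start + chunk_size]
        # One file has been streamed completely.
        yield Signal.END_OF_FILE, b""
    # The entire input has been streamed.
    yield Signal.END_OF_STREAMING, b""


def collect(events):
    """Rebuild complete files from the event sequence on the receiving side."""
    completed, current = [], bytearray()
    for signal, chunk in events:
        if signal is Signal.DATA:
            current.extend(chunk)
        elif signal is Signal.END_OF_FILE:
            completed.append(bytes(current))
            current.clear()
        else:
            # END_OF_STREAMING: no more input follows.
            break
    return completed
```

In this model, a receiver treats a file as complete only after it sees the end-of-file signal, which mirrors why a partially streamed file needs to be distinguishable from a finished one.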
When the Data Processor transformation sends portions of the input to the complex file target, PowerExchange for HDFS appends unique identifier information to the file name. The Data Integration Service uses the unique identifiers to recognize that streaming is still in progress. Therefore, the file name that you specify in the complex file write properties is not the same as the output file name in HDFS, which also contains the unique identifier information.
The unique identifier format depends on whether the file is compressed. The following table describes the unique identifier format for uncompressed and compressed files:
Run-time Environment Type   File Type           Unique Identifier Format
Native                      Uncompressed File   <filename>_<unique identifier>_<seq>.<ext>
Native                      Compressed File     <filename>_<unique identifier>_<seq>.<compression format extension>
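Because the output file name differs from the configured name, a downstream script might need to locate the output files by pattern. The following Python sketch shows one way to do that for the formats in the table above. The function names are hypothetical. The guide does not specify what the unique identifier or sequence value looks like, so the sketch assumes that the sequence value contains no underscore or period and accepts any non-empty unique identifier.

```python
import re
from pathlib import PurePosixPath


def output_name_pattern(configured_name):
    """Build a regex for <filename>_<unique identifier>_<seq>.<ext>."""
    path = PurePosixPath(configured_name)
    stem, ext = path.stem, path.suffix.lstrip(".")
    # Only require a trailing ".<ext>" when the configured name has an extension.
    ext_part = r"\." + re.escape(ext) if ext else ""
    # Assumption: <seq> has no "_" or "."; <unique identifier> is any non-empty text.
    return re.compile(
        r"^" + re.escape(stem) + r"_(?P<uid>.+)_(?P<seq>[^_.]+)" + ext_part + r"$"
    )


def find_outputs(configured_name, listing):
    """Return the names in listing that match the configured file name."""
    pattern = output_name_pattern(configured_name)
    return [name for name in listing if pattern.match(name)]
```

For example, with a configured name of orders.json and a directory listing obtained by any means, find_outputs returns only the names that carry the extra identifier and sequence parts. For a compressed file, pass the name with the compression format extension, such as orders.gz.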
If you do not include the compression format extension as part of the file name in the complex file write properties, PowerExchange for HDFS appends extensions based on the compression format.
The following table describes the extensions that PowerExchange for HDFS appends based on the compression format that you use:
Compression Format   File Name Extension that PowerExchange for HDFS Appends
DEFLATE              .deflate
Gzip                 .gz
Bzip2                .bz2
Lzo                  .lzo
Snappy               .snz