PowerExchange for HDFS User Guide

10.4.1

Advanced Properties

The Developer tool displays the advanced properties for complex file targets in the Input transformation in the Write view.

The following table describes the advanced properties that you configure for complex file targets:

Property	Description
File Directory	The directory location of the complex file target. If the directory is in HDFS, enter the path without the node URI. For example, /user/lib/testdir specifies the location of a directory in HDFS. The path must not contain more than 512 characters. If the directory is in the local system, enter the fully qualified path. For example, /user/testdir specifies the location of a directory in the local system. The Data Integration Service ignores any subdirectories and their contents.
File Name	The name of the output file. PowerExchange for HDFS appends a unique identifier to the file name before it writes the file to HDFS. In Spark mode, PowerExchange for HDFS appends the .avro extension to the file name.
Overwrite Target	Indicates whether the Data Integration Service must first delete the target data before writing data. If you select the Overwrite Target option, the Data Integration Service deletes the target data before writing data. If you do not select this option, the Data Integration Service creates a new file in the target and writes the data to the file. This option is applicable when you run a mapping in the native environment or on the Spark engine to write data to complex files.
File Format	The file format. Select one of the following file formats: Binary. Select Binary to write any file format. Sequence. Select Sequence File Format for target files of a Hadoop-specific binary format that contain key and value pairs. Custom Output. Select Output Format to specify a custom output format. You must specify the class name implementing the OutputFormat interface in the Output Format field. Assign Parameter. Select Assign Parameter to parameterize the file format. Default is Binary.
Output Format	The class name for files of the output format. If you select Output Format in the File Format field, you must specify the fully qualified class name implementing the OutputFormat interface.
Output Key Class	The class name for the output key. If you select Output Format in the File Format field, you must specify the fully qualified class name for the output key. You can specify one of the following output key classes: BytesWritable, Text, LongWritable, or IntWritable. PowerExchange for HDFS generates the key in ascending order.
Output Value Class	The class name for the output value. If you select Output Format in the File Format field, you must specify the fully qualified class name for the output value. You can use any custom writable class that Hadoop supports. Determine the output value class based on the type of data that you want to write. When you use custom output formats, the value part of the data that is streamed to the complex file data object write operation must be in a serialized form.
Compression Format	Optional. The compression format for binary files. Select one of the following options: None, Auto, DEFLATE, gzip, bzip2, LZO, Snappy, Custom, Assign Parameter...
Custom Compression Codec	Required for custom compression. Specify the fully qualified class name implementing the CompressionCodec interface.
Sequence File Compression Type	Optional. The compression format for sequence files. Select one of the following options: None, Record, Block, Assign Parameter...
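
Several of the properties in the table depend on one another. The following Python sketch shows one way to check a set of values against the rules stated above before you enter them in the Developer tool. It is illustrative only and is not part of PowerExchange for HDFS: the WriteProperties class, the validate function, and the "://" test for a node URI are assumptions made for this example. The fully qualified key class names assume the standard Hadoop org.apache.hadoop.io package.

```python
from dataclasses import dataclass
from typing import List, Optional

# Fully qualified Hadoop names for the four key classes named in the table
# (assumes the standard org.apache.hadoop.io package).
ALLOWED_KEY_CLASSES = {
    "org.apache.hadoop.io.BytesWritable",
    "org.apache.hadoop.io.Text",
    "org.apache.hadoop.io.LongWritable",
    "org.apache.hadoop.io.IntWritable",
}

MAX_PATH_LENGTH = 512  # limit stated for the File Directory property


@dataclass
class WriteProperties:
    """Hypothetical container for the advanced properties; not an Informatica API."""
    file_directory: str
    file_name: str
    file_format: str = "Binary"  # default per the table
    output_format: Optional[str] = None
    output_key_class: Optional[str] = None
    output_value_class: Optional[str] = None
    compression_format: str = "None"
    custom_compression_codec: Optional[str] = None


def validate(props: WriteProperties) -> List[str]:
    """Return a list of rule violations; an empty list means no rule was broken."""
    problems: List[str] = []

    # Assumption: a node URI is recognized by the "://" separator.
    if "://" in props.file_directory:
        problems.append("File Directory: enter the path without the node URI")
    if len(props.file_directory) > MAX_PATH_LENGTH:
        problems.append("File Directory: path exceeds 512 characters")

    # The three class-name properties are required for a custom output format.
    if props.file_format == "Output Format":
        required = {
            "Output Format": props.output_format,
            "Output Key Class": props.output_key_class,
            "Output Value Class": props.output_value_class,
        }
        for label, value in required.items():
            if not value:
                problems.append(f"{label}: required when File Format is Output Format")
        if props.output_key_class and props.output_key_class not in ALLOWED_KEY_CLASSES:
            problems.append("Output Key Class: not one of the listed key classes")

    # A codec class is required only when the compression format is Custom.
    if props.compression_format == "Custom" and not props.custom_compression_codec:
        problems.append("Custom Compression Codec: required for custom compression")

    return problems
```

For example, validate(WriteProperties(file_directory="/user/lib/testdir", file_name="out")) returns an empty list, while a directory entered with a node URI or a Custom compression format without a codec class each add one entry to the returned list.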
