Advanced Properties

The Developer tool displays the advanced properties for complex file targets in the Input transformation in the Write view.
The following table describes the advanced properties that you configure for complex file targets in a streaming mapping:
Property
Description
Operation Type
Indicates the type of data object operation.
This is a read-only property.
File Directory
The location of the complex file target.
At run time, the Data Integration Service creates temporary directories in the specified file directory to manage the target files.
If the directory is in HDFS, enter the path without the node URI. For example, /user/lib/testdir specifies the location of a directory in HDFS. The path must be 512 characters or less.
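The two File Directory constraints above (no node URI, at most 512 characters) can be checked before you enter the value. The following is a minimal sketch only; the function name check_file_directory and the example host namenode:8020 are hypothetical and are not part of the product.

```python
from typing import List
from urllib.parse import urlparse

MAX_PATH_LENGTH = 512  # limit stated in the File Directory description


def check_file_directory(path: str) -> List[str]:
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    parsed = urlparse(path)
    # A value such as hdfs://namenode:8020/user/lib/testdir carries a scheme
    # and a host, which is the node URI that the description says to omit.
    if parsed.scheme or parsed.netloc:
        problems.append("path includes a node URI; enter only the directory path")
    if len(path) > MAX_PATH_LENGTH:
        problems.append(
            "path is %d characters; the limit is %d" % (len(path), MAX_PATH_LENGTH)
        )
    return problems


if __name__ == "__main__":
    print(check_file_directory("/user/lib/testdir"))
    print(check_file_directory("hdfs://namenode:8020/user/lib/testdir"))
```

An empty list for /user/lib/testdir indicates that the path satisfies both stated constraints. The sketch does not verify that the directory exists.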
Overwrite Target
Not applicable for streaming mappings.
File Name
The name of the output file. Spark appends the file name with a unique identifier before it writes the file to HDFS.
File Format
The file format. Select one of the following file formats:
  • Binary. Select Binary to write any file format.
  • Sequence. Select Sequence File Format for target files of a specific format that contain key and value pairs.
Output Format
The class name for files of the output format. If you select Output Format in the File Format field, you must specify the fully qualified class name implementing the OutputFormat interface.
Output Key Class
The class name for the output key. By default, the output key class is NullWritable.
Output Value Class
The class name for the output value. By default, the output value class is Text.
Compression Format
Optional. The compression format for binary files. Select one of the following options:
  • None
  • Auto
  • DEFLATE
  • gzip
  • bzip2
  • LZO
  • Snappy
  • Custom
Custom Compression Codec
Required for custom compression. Specify the fully qualified class name implementing the CompressionCodec interface.
Sequence File Compression Type
Optional. The compression format for sequence files. Select one of the following options:
  • None
  • Record
  • Block
Stream Rollover Size in GB
Optional. Target file size, in gigabytes (GB), at which to trigger rollover. A value of zero (0) means that the target file does not roll over based on size. Default is 1 GB.
Stream Rollover Time in Hours
Optional. Length of time, in hours, after which a target file rolls over. A value of zero (0) means that the target file does not roll over based on time. Default is 1 hour.
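The two rollover properties can be read together as a simple decision rule. The sketch below is illustrative only and rests on two assumptions that the descriptions do not state: that a file rolls over as soon as either enabled condition is met, and that one gigabyte means 1024^3 bytes. The function name should_roll_over is hypothetical.

```python
GB = 1024 ** 3  # assumption: binary gigabytes; the property description does not define the unit
SECONDS_PER_HOUR = 3600


def should_roll_over(file_bytes, open_seconds,
                     rollover_size_gb=1.0, rollover_time_hours=1.0):
    """Return True when an enabled size or time threshold has been reached.

    The defaults mirror the documented defaults (1 GB and 1 hour).
    A threshold of 0 disables that condition, as the descriptions state.
    """
    size_hit = rollover_size_gb > 0 and file_bytes >= rollover_size_gb * GB
    time_hit = (rollover_time_hours > 0
                and open_seconds >= rollover_time_hours * SECONDS_PER_HOUR)
    # Assumption: either condition alone is enough to trigger rollover.
    return size_hit or time_hit


if __name__ == "__main__":
    # A 200 MB file that has been open for 90 minutes, with default settings.
    print(should_roll_over(200 * 1024 ** 2, 90 * 60))
```

With both properties set to zero, the function never returns True, which corresponds to a target file that does not roll over on size or time.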
Schema Location
Optional. The schema location to fetch the schema in a streaming mapping.
Only Avro schemas with the Binary file format are supported. You must disable column projection.
If you select External Location for the dynamic schema strategy, you must create a writer.avsc file that contains the schema content and place it at the schema location in a directory named after the topic. For example: <Schema Location>/<Topic Name>/writer.avsc. Then, specify only the path up to the schema location.
The directory under the schema location must be named after the topic.
If Dynamic Schema Strategy is enabled at the source and a schema location is not provided for the HDFS target, then at run time the schema for the HDFS target is fetched from the source transformation schema.
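The External Location layout described above, <Schema Location>/<Topic Name>/writer.avsc, can be sketched as follows. The schema location /schemas, the topic name orders, and the record fields are hypothetical values chosen for illustration; only the file name writer.avsc and the directory layout come from the property description.

```python
import json
from pathlib import PurePosixPath


def writer_schema_path(schema_location: str, topic_name: str) -> str:
    """Build <Schema Location>/<Topic Name>/writer.avsc as a POSIX-style path."""
    return str(PurePosixPath(schema_location) / topic_name / "writer.avsc")


# Hypothetical Avro record schema to store in writer.avsc.
example_schema = {
    "type": "record",
    "name": "Order",
    "fields": [
        {"name": "order_id", "type": "string"},
        {"name": "amount", "type": "double"},
    ],
}

if __name__ == "__main__":
    # Where the file would live for a topic named "orders".
    print(writer_schema_path("/schemas", "orders"))
    # The content that the file would hold.
    print(json.dumps(example_schema, indent=2))
```

In this example, the value to enter in the Schema Location property would be /schemas, not the full path to writer.avsc.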
Interim Directory
The active directory location of the complex file target. This directory stages all the files currently in open state. When the stream rollover condition is met, the files are moved from the interim directory to the target directory.
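The staging behavior described for the interim directory can be pictured with a local file system analogy. This is a sketch under stated assumptions, not the Data Integration Service implementation: it uses local directories instead of HDFS, and the function name publish_rolled_file and the file name part-0001 are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def publish_rolled_file(interim_dir: Path, target_dir: Path, file_name: str) -> Path:
    """Move a file that met its rollover condition from interim to target."""
    target_dir.mkdir(parents=True, exist_ok=True)
    destination = target_dir / file_name
    # While a file is open it exists only under the interim directory.
    shutil.move(str(interim_dir / file_name), str(destination))
    return destination


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        interim = Path(root) / "interim"
        target = Path(root) / "target"
        interim.mkdir()
        (interim / "part-0001").write_text("streamed records\n")
        moved = publish_rolled_file(interim, target, "part-0001")
        # After the move, the file is visible only in the target directory.
        print(moved.exists(), (interim / "part-0001").exists())
```

Keeping open files in a separate directory means that readers of the target directory see only files that have already rolled over.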