Mass Ingestion Guide

10.4.0


HDFS Target

Configure an HDFS target to ingest source data to a flat file on HDFS.

When you configure the mass ingestion specification to ingest data to an HDFS target, you configure an HDFS connection and an ingestion directory to define the target.

If you enable incremental load in the definition of the mass ingestion specification, you must also configure the incremental load options for the HDFS target and select the mode used to ingest the data.
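The two modes offered by the Mode property, Append and Overwrite, can be pictured with a small in-memory model. The following Python sketch is illustrative only: the Mode enum and the apply_incremental_load function are hypothetical names, and the target is modeled as a list of rows rather than as files on HDFS.

```python
from enum import Enum


class Mode(Enum):
    """Incremental load modes named in the Mode property."""
    APPEND = "Append"
    OVERWRITE = "Overwrite"


def apply_incremental_load(target_rows, incremental_rows, mode=Mode.APPEND):
    """Return the rows in the target after one incremental load.

    This is a conceptual model, not the behavior of the actual tool.
    """
    if mode is Mode.APPEND:
        # Append keeps the existing data and adds the incremental data.
        return list(target_rows) + list(incremental_rows)
    if mode is Mode.OVERWRITE:
        # Overwrite discards the existing data and keeps only the new data.
        return list(incremental_rows)
    raise ValueError(f"unsupported mode: {mode!r}")


existing = ["row1", "row2"]
increment = ["row3"]
print(apply_incremental_load(existing, increment, Mode.APPEND))
print(apply_incremental_load(existing, increment, Mode.OVERWRITE))
```

The default argument mirrors the documented default mode, Append.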

The following image shows the Target page for an HDFS target:

This screenshot shows the Target page of the mass ingestion specification for an HDFS target. On the Target page, you can configure properties to define the HDFS target. The bottom of the page shows a section for Incremental Load Options. In the top-right corner, you can click Next to go to the next page or X to discard the specification.

The following table describes the properties that you can configure to define the HDFS target:

Property	Description
Target Connection	Required. The HDFS connection used to locate the HDFS storage target. If the available HDFS connections change, refresh the browser or log out and log back in to the Mass Ingestion tool.
Target Table Prefix	The prefix added to the names of the target files. Enter a string. You can enter alphanumeric and underscore characters. The prefix is not case sensitive.
Target Table Suffix	The suffix added to the names of the target files. Enter a string. You can enter alphanumeric and underscore characters. The suffix is not case sensitive.
Ingestion Directory	Required. The target directory on HDFS. A sub-directory is created under the ingestion directory for each source that is ingested. If the specified directory already exists, the directory is replaced. For example, if you enter /temp, a source table named PRODUCT is ingested to the directory /temp/PRODUCT/.
Compression	Required. The compression format used to store the target files. You can select None, Gzip, Bzip2, LZO, Snappy, or Custom. If you select Custom, enter the compression codec. Default is None.
Compression Codec	If you select custom compression, enter the fully qualified class name implementing the Hadoop CompressionCodec interface.
Delimiters	The delimiter used to separate data in the target files. You can select Comma, Semicolon, Space, Tab, or Other. If you select Other, you can define a custom delimiter.
Other Delimiter	Required if you choose Other for the delimiter. Enter a custom delimiter.
Mode	Required if you enable incremental load. Select Append or Overwrite. Append mode appends the incremental data to the target. Overwrite mode overwrites the data in the target with the incremental data. Default is Append.
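
The naming rules in the table above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the tool's implementation: the function names are hypothetical, and the guide does not say exactly how the prefix and suffix are combined with the file name, so plain concatenation is assumed here. Only the character rule (alphanumeric and underscore) and the /temp/PRODUCT/ directory example come from the guide.

```python
import posixpath
import re

# The guide allows alphanumeric and underscore characters in the prefix
# and suffix. An empty string is treated here as "not set" (assumption).
_AFFIX_PATTERN = re.compile(r"[A-Za-z0-9_]*")


def validate_affix(value: str) -> str:
    """Return the prefix or suffix unchanged if it uses only allowed characters."""
    if not _AFFIX_PATTERN.fullmatch(value):
        raise ValueError(f"invalid prefix or suffix: {value!r}")
    return value


def target_directory(ingestion_directory: str, source_name: str) -> str:
    """Build the per-source sub-directory under the ingestion directory."""
    return posixpath.join(ingestion_directory, source_name) + "/"


def target_file_name(source_name: str, prefix: str = "", suffix: str = "") -> str:
    """Hypothetical naming scheme: prefix, source name, and suffix concatenated."""
    return f"{validate_affix(prefix)}{source_name}{validate_affix(suffix)}"


print(target_directory("/temp", "PRODUCT"))
print(target_file_name("PRODUCT", prefix="stg_", suffix="_v1"))
```

The first call reproduces the directory from the Ingestion Directory example. A value such as "stg-" would be rejected because the hyphen is outside the allowed character set.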

When the Data Integration Service stores temporary files that you ingest to an HDFS target, it appends a unique ID to the original file name. The resulting file name can have a maximum length of 255 characters.
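
Because the unique ID counts toward the 255-character limit, it can help to check file name lengths before ingestion. The following Python sketch shows one way to do that. The guide does not document the format of the unique ID or how it is joined to the file name, so the underscore separator and the 32-character ID length below are assumptions, and the function names are hypothetical.

```python
MAX_FILE_NAME_LENGTH = 255  # maximum length stated in this guide


def temporary_file_name(original_name: str, unique_id: str, separator: str = "_") -> str:
    """Append a unique ID to the original name and enforce the length limit.

    The separator and the ID format are assumptions for illustration.
    """
    candidate = f"{original_name}{separator}{unique_id}"
    if len(candidate) > MAX_FILE_NAME_LENGTH:
        raise ValueError(
            f"file name is {len(candidate)} characters; "
            f"the limit is {MAX_FILE_NAME_LENGTH}"
        )
    return candidate


def max_original_length(unique_id_length: int, separator_length: int = 1) -> int:
    """Longest original name that still fits once the ID is appended."""
    return MAX_FILE_NAME_LENGTH - unique_id_length - separator_length


# Example with a hypothetical 32-character ID and a one-character separator.
print(max_original_length(32))  # 222
```

Under these assumptions, an original file name longer than 222 characters would exceed the limit after the ID is appended.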
