PowerExchange for Hadoop User Guide for PowerCenter

Session Properties for a Hadoop Target

You can configure a session for a Hadoop target to load data to HDFS or a Hive table. When you load data to a Hadoop target, you can set partitioning properties and file paths. When you load data to HDFS, you can also set header options.
The following table describes the properties that you can configure for a Hadoop target:
Session Property
    Description

Merge Type
    Type of merge that the Integration Service performs on the data for partitioned targets.
    You can choose one of the following merge types:
      • No Merge
      • Sequential Merge

Append if Exists
    Appends data to a file.
    If the merge file path refers to a directory, this option applies to the files in the directory.

Header Options
    Creates a header row in the flat file when loading data to HDFS.

Auto generate partition file names
    Generates partition file names.

Merge File Path
    If you choose a sequential merge type, defines the final Hadoop target location where the Integration Service creates the merge file. The Integration Service creates the merge file from the intermediate merge file output in the location defined in the output file path.

Generate And Load Hive Table
    Generates a relational table in the Hive database. The Integration Service loads data into the Hive table from the HDFS flat file target.

Overwrite Hive Table
    Overwrites the data in the Hive table.

Hive Table Name
    Hive table name. Default is the target instance name.

Externally Managed Hive Table
    Loads Hive table data to the location defined in the output file path.

Output File Path
    Defines the absolute or relative directory path or file path on the HDFS host where the Integration Service writes the HDFS data. A relative path is relative to the home directory of the Hadoop user.
    If you choose a sequential merge type, defines where the Integration Service writes intermediate output before it writes to the final Hadoop target location as defined by the merge file path.
    If you choose to generate partition file names, this path can be a directory path.

Reject File Path
    The path to the reject file. By default, the Integration Service writes all reject files to the service process variable directory, $PMBadFileDir.
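
The interaction between the output file path, the merge file path, and the merge type can be easier to follow as a small sketch. The Python below is illustrative only and is not part of PowerCenter or PowerExchange: the function names resolve_output_path and plan_write_locations are hypothetical, and the home directory layout /user/<user name> is an assumption based on the usual HDFS default, which may differ in your cluster.

```python
import posixpath


def resolve_output_path(path, hadoop_user, home_prefix="/user"):
    """Return an absolute HDFS path for an output file path setting.

    Assumption: the Hadoop user's home directory is <home_prefix>/<hadoop_user>,
    which is the common HDFS default. Absolute paths are returned unchanged
    apart from normalization.
    """
    if posixpath.isabs(path):
        return posixpath.normpath(path)
    home = posixpath.join(home_prefix, hadoop_user)
    return posixpath.normpath(posixpath.join(home, path))


def plan_write_locations(merge_type, output_file_path, merge_file_path=None):
    """Describe where data lands for each merge type (hypothetical helper).

    Sequential Merge: the output file path holds intermediate output and the
    merge file path is the final target location.
    No Merge: the output file path is the final location.
    """
    if merge_type == "Sequential Merge":
        if not merge_file_path:
            raise ValueError("Sequential Merge requires a merge file path")
        return {"intermediate": output_file_path, "final": merge_file_path}
    if merge_type == "No Merge":
        return {"intermediate": None, "final": output_file_path}
    raise ValueError("Unknown merge type: " + merge_type)


# Example values are made up for illustration.
relative = resolve_output_path("sales/out", "etl_user")
absolute = resolve_output_path("/data/sales/out", "etl_user")
plan = plan_write_locations("Sequential Merge", relative, "/data/sales/merged.txt")
```

In this sketch a relative setting such as sales/out resolves under the assumed home directory of the Hadoop user, while an absolute setting is used as given. With a sequential merge, the resolved output file path is only the staging location, and the merge file path is where the final file appears.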
