Table of Contents

  1. Preface
  2. Introduction to PowerExchange for HDFS
  3. PowerExchange for HDFS Configuration
  4. HDFS Connections
  5. HDFS Data Objects
  6. HDFS Data Extraction
  7. HDFS Data Load
  8. HDFS Mappings
  9. Appendix A: Data Type Reference

PowerExchange for HDFS User Guide

HDFS Data Load Mapping Example

Your organization needs to denormalize employee records. The employee ID, name, and address details are stored in flat files in HDFS. Create a mapping that reads the employee details from the flat files, denormalizes the data into a hierarchical format, and writes it to a complex file target in HDFS.
You can use the target data for business analytics.
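For illustration, the flat file records and the hierarchical target record might look like the following. The field layout and the values are hypothetical; the guide does not define the file formats.

    Read_Name_Flat_File (employee ID and name):
        1001,Jane,Doe

    Read_Address_Flat_File (employee ID and address):
        1001,123 Main St,Springfield

    Hierarchical record in the complex file target:
        {"employee_id": "1001",
         "name": {"first": "Jane", "last": "Doe"},
         "address": {"street": "123 Main St", "city": "Springfield"}}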
The following figure shows the example mapping:
[Figure: two flat file inputs connected to a Data Processor transformation, which writes to a complex file output.]
You can use the following objects in the HDFS mapping:
HDFS Inputs
The inputs, Read_Address_Flat_File and Read_Name_Flat_File, are flat files stored in HDFS.
Data Processor Transformation
The Data Processor transformation, JSON_R2H_Denormalize_NameAndAddress, reads the flat files, denormalizes the data, and provides a binary, hierarchical output.
HDFS Output
The output, Write_Complex_File, is a complex file stored in HDFS.
When you run the mapping, the Data Integration Service reads the input flat files and passes the employee ID, name, and address data to the Data Processor transformation. The transformation denormalizes the data and returns hierarchical output in a binary stream, which the Data Integration Service writes to the complex file target in HDFS.
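To make the data flow concrete, the following minimal Python sketch performs an equivalent denormalization outside of Informatica. It is not the Data Processor transformation's implementation; the file names, the comma-delimited layout, and the JSON shape are assumptions carried over from the sample records above.

    import csv
    import json

    # Hypothetical flat file layouts (comma-delimited, no header):
    #   names.csv:     employee_id,first_name,last_name
    #   addresses.csv: employee_id,street,city
    def denormalize(names_path, addresses_path, out_path):
        # Index the address rows by employee ID so that each name row
        # can be joined to its address.
        addresses = {}
        with open(addresses_path, newline="") as f:
            for emp_id, street, city in csv.reader(f):
                addresses[emp_id] = {"street": street, "city": city}

        # Join each name row with its address and write one hierarchical
        # JSON record per employee, one record per line.
        with open(names_path, newline="") as f, open(out_path, "w") as out:
            for emp_id, first, last in csv.reader(f):
                record = {
                    "employee_id": emp_id,
                    "name": {"first": first, "last": last},
                    "address": addresses.get(emp_id),
                }
                out.write(json.dumps(record) + "\n")

    denormalize("names.csv", "addresses.csv", "employees.json")

In the mapping, the Data Processor transformation performs this join and conversion and streams the result as binary data to the complex file writer.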
You can configure the mapping to run in a native or Hadoop run-time environment.
Complete the following tasks to configure the mapping:
  1. Create an HDFS connection to read flat files from the Hadoop cluster. Sample connection values appear after this procedure.
  2. Specify the read properties for the flat files.
  3. Drag and drop the flat files into a mapping.
  4. Create a Data Processor transformation. Set the Data Processor transformation port to binary.
  5. Create an HDFS connection to write data to the complex file target.
  6. Create a complex file data object write operation. Specify the following parameters:
    • The file as the resource in the data object.
    • The HDFS file location.
  7. Drag and drop the complex file data object write operation into the mapping.
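As an example of the connection details in steps 1 and 5, an HDFS connection points at the NameNode of the Hadoop cluster. The values below are hypothetical; use the host, port, and user that your cluster administrator provides.

    Name:          HDFS_employee_conn
    User Name:     hdfsuser
    NameNode URI:  hdfs://namenode.example.com:8020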
