Table of Contents


  1. Preface
  2. Introduction to PowerExchange for HDFS
  3. PowerExchange for HDFS Configuration
  4. HDFS Connections
  5. HDFS Data Objects
  6. HDFS Data Extraction
  7. HDFS Data Load
  8. HDFS Mappings
  9. Appendix A: Data Type Reference

PowerExchange for HDFS User Guide

HDFS or View File System (ViewFS) Data Load Mapping Example

Your organization needs to denormalize employee ID, name, and address details that are stored in flat files in HDFS or ViewFS. Create a mapping that reads the employee ID, name, and address details from the flat files, denormalizes the data into hierarchical form, and writes the output to a complex file target in HDFS or ViewFS.
You can use the target data for business analytics.
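To make the conversion concrete, assume the name file and address file contain rows like the following. The field layout and values are hypothetical:

    names file:
    emp_id,name
    1001,Ana Lopez

    addresses file:
    emp_id,street,city
    1001,12 Oak St,Austin
    1001,9 Elm Ave,Dallas

The mapping combines these rows into one hierarchical record per employee, for example:

    {"emp_id": "1001", "name": "Ana Lopez",
     "addresses": [{"street": "12 Oak St", "city": "Austin"},
                   {"street": "9 Elm Ave", "city": "Dallas"}]}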
The following figure shows the example mapping:
[Figure: the example mapping shows two flat file inputs, a Data Processor transformation, and a complex file output.]
The HDFS or ViewFS mapping contains the following objects:
HDFS Inputs
The inputs, Read_Address_Flat_File and Read_Name_Flat_File, are flat files stored in HDFS or ViewFS.
Data Processor Transformation
The Data Processor transformation, JSON_R2H_Denormalize_NameAndAddress, reads the flat files, denormalizes the data, and provides a binary, hierarchical output.
HDFS Output
The output, Write_Complex_File, is a complex file stored in HDFS or ViewFS.
When you run the mapping, the Data Integration Service reads the input flat files and passes the employee ID, name, and address data to the Data Processor transformation. The Data Processor transformation denormalizes the employee ID, name, and address data, and provides a hierarchical output in a binary stream. The binary and hierarchical output is written to the HDFS or ViewFS complex file target.
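The following minimal Python sketch illustrates the equivalent relational-to-hierarchical logic outside the Developer tool. It is not the Data Processor transformation itself, and the file paths and field names (emp_id, name, street, city) are assumptions that match the hypothetical sample above:

    # Sketch of the relational-to-hierarchical conversion, assuming
    # CSV inputs keyed by employee ID. Field names are hypothetical;
    # in the mapping, the Data Processor transformation does this work.
    import csv
    import json

    def denormalize(name_path, address_path):
        # Read employee names keyed by employee ID.
        with open(name_path, newline="") as f:
            names = {row["emp_id"]: row["name"] for row in csv.DictReader(f)}

        # Group address rows under each employee ID to build nested records.
        records = {}
        with open(address_path, newline="") as f:
            for row in csv.DictReader(f):
                rec = records.setdefault(row["emp_id"], {
                    "emp_id": row["emp_id"],
                    "name": names.get(row["emp_id"]),
                    "addresses": [],
                })
                rec["addresses"].append({"street": row["street"], "city": row["city"]})

        # Emit one hierarchical JSON document per employee.
        return [json.dumps(rec) for rec in records.values()]

    print("\n".join(denormalize("names.csv", "addresses.csv")))

In the actual mapping, the Data Processor transformation performs this conversion and streams the result as binary data to the complex file target.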
You can configure the mapping to run in a native or Hadoop run-time environment.
Complete the following tasks to configure the mapping:
  1. Create an HDFS or ViewFS connection to read flat files from the Hadoop cluster.
  2. Specify the read properties for the flat files.
  3. Drag and drop the flat files into a mapping.
  4. Create a Data Processor transformation. Set the Data Processor transformation port to binary.
  5. Create an HDFS or ViewFS connection to write data to the complex file target.
  6. Create a complex file data object write operation. Specify the following parameters:
    • The file as the resource in the data object.
    • The HDFS or ViewFS file location (see the example URIs after these steps).
  7. Drag and drop the complex file data object write operation into the mapping.
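
For reference, the file locations in steps 1, 5, and 6 are typically specified as HDFS or ViewFS URIs. The host name, port, cluster name, and paths below are hypothetical:

    hdfs://namenode.example.com:8020/user/infa/employee/names.csv
    viewfs://cluster1/user/infa/employee/addresses.csv

After the mapping runs, you can confirm that the target file exists with a standard Hadoop file system command, for example:

    hdfs dfs -ls /user/infa/employee/output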
