PowerExchange for HDFS User Guide

Generate the Source File Name for HDFS Data Objects

You can add a file name column to an HDFS data object to identify the source file that contains a particular record of data. You can configure a mapping with the file name column for both flat file and complex file data objects. When you read data from HDFS, the file name column returns the fully qualified path of the source file.
To write the source file name to each source row, add a File Name Column port in the Overview view. The File Name Column port contains the name and the fully qualified path of each source file. It is a string port with a default precision of 256 characters.
If the file or directory is in HDFS, enter the path without the node URI. For example, /user/lib/testdir specifies the location of a directory in HDFS. The path must not contain more than 512 characters.
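The two path rules above (no node URI, at most 512 characters) can be checked before you enter a path. The following Python sketch is only an illustration: the function name check_hdfs_source_path and its messages are hypothetical and are not part of PowerExchange for HDFS.

```python
from urllib.parse import urlsplit

# Limit stated in this guide for the source path.
MAX_PATH_LENGTH = 512


def check_hdfs_source_path(path: str) -> list:
    """Return a list of problems found in a source path (empty if none).

    Hypothetical helper for illustration; not a PowerExchange API.
    """
    problems = []
    parts = urlsplit(path)
    # A scheme (hdfs:) or a network location (host:port) means the
    # node URI was included, which the guide says to leave out.
    if parts.scheme or parts.netloc:
        problems.append("path includes a node URI; enter only the path")
    if len(path) > MAX_PATH_LENGTH:
        problems.append("path is longer than 512 characters")
    return problems


print(check_hdfs_source_path("/user/lib/testdir"))
print(check_hdfs_source_path("hdfs://irldv:5008/hive/warehouse/ff.txt"))
```

The first call returns an empty list because the path has no node URI and is within the length limit. The second call reports the node URI.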
When you use a file name column in a Read transformation, the file name column returns the value in the following format for HDFS:
hdfs://<host name>:<port>/<file name path>
For example, the file name column returns hdfs://irldv:5008/hive/warehouse/ff.txt, where the host name is irldv and the port is 5008.
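If you process the file name column value outside the mapping, you can split it into its parts with a standard URL parser. The following Python sketch assumes the value follows the hdfs://<host name>:<port>/<file name path> format shown above. The function name split_file_name_column and the dictionary keys are hypothetical.

```python
import posixpath
from urllib.parse import urlsplit


def split_file_name_column(value: str) -> dict:
    """Split a file name column value into host, port, path, and file name.

    Hypothetical helper for illustration; assumes the format
    hdfs://<host name>:<port>/<file name path>.
    """
    parts = urlsplit(value)
    if parts.scheme != "hdfs":
        raise ValueError("expected a value that starts with hdfs://")
    return {
        "host": parts.hostname,
        "port": parts.port,
        "path": parts.path,
        # Last path component, that is, the source file name itself.
        "file": posixpath.basename(parts.path),
    }


info = split_file_name_column("hdfs://irldv:5008/hive/warehouse/ff.txt")
print(info)
# {'host': 'irldv', 'port': 5008, 'path': '/hive/warehouse/ff.txt', 'file': 'ff.txt'}
```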
