Generate the Source File Name for HDFS or View File System (ViewFS) Data Objects
You can add a file name column to the flat file data object. The file name column helps you identify the source file that contains a particular record of data. You can configure the mapping with the file name column for both flat file and complex file data objects. When you read data from HDFS or ViewFS, you can extract the fully qualified path of the source file.
You can configure the mapping to write the source file name to each source row when you add a File Name Column port in the Overview view. The File Name Column port contains the name and the fully qualified path for each source file. The File Name Column port is a string port with a default precision of 256 characters.
If the file or directory is in HDFS or ViewFS, enter the path without the node URI. For example, /user/lib/testdir specifies the location of a directory in HDFS or ViewFS. The path must not contain more than 512 characters.
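The path rules above can be checked before you enter a value. The following Python sketch is illustrative only and is not part of the product. The function name is_valid_source_path and the constant MAX_PATH_LENGTH are hypothetical, and the requirement that the path start with a forward slash is an assumption based on the example path.

```python
from urllib.parse import urlsplit

# Limit stated in the text: the path must not contain more than 512 characters.
MAX_PATH_LENGTH = 512


def is_valid_source_path(path):
    """Return True if path has no node URI and is within the length limit.

    Hypothetical helper for illustration; not a product API.
    """
    parts = urlsplit(path)
    # A scheme (hdfs:) or an authority (//host:port) means the node URI was included.
    if parts.scheme or parts.netloc:
        return False
    # Assumption: the path is absolute, as in the example /user/lib/testdir.
    return path.startswith("/") and len(path) <= MAX_PATH_LENGTH
```

For example, is_valid_source_path("/user/lib/testdir") accepts the path, while a value that begins with hdfs://<host name>:<port> is rejected because it includes the node URI.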
When you use a file name column in a Read transformation, the file name column returns the value in the following format for HDFS:
hdfs://<host name>:<port>/<file name path>
For example, the file name column returns hdfs://irldv:5008/hive/warehouse/ff.txt, where the host name is irldv and the port is 5008.
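If a downstream process needs the individual parts of the returned value, the format can be split with a standard URL parser. The following Python sketch is illustrative only and assumes that the value follows the hdfs://<host name>:<port>/<file name path> format shown above. The function name split_file_name_column is hypothetical.

```python
from urllib.parse import urlsplit


def split_file_name_column(value):
    """Split a file name column value into host name, port, and file path.

    Hypothetical helper for illustration; not a product API.
    """
    parts = urlsplit(value)
    if parts.scheme != "hdfs":
        raise ValueError("expected a value that starts with hdfs://")
    # hostname is returned in lowercase; port is an int, or None if absent.
    return parts.hostname, parts.port, parts.path


host, port, path = split_file_name_column("hdfs://irldv:5008/hive/warehouse/ff.txt")
print(host, port, path)
```

With the example value, host is irldv, port is 5008, and path is /hive/warehouse/ff.txt.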