Table of Contents

Search

  1. Preface
  2. Introduction to Informatica Data Engineering Integration
  3. Mappings
  4. Mapping Optimization
  5. Sources
  6. Targets
  7. Transformations
  8. Python Transformation
  9. Data Preview
  10. Cluster Workflows
  11. Profiles
  12. Monitoring
  13. Hierarchical Data Processing
  14. Hierarchical Data Processing Configuration
  15. Hierarchical Data Processing with Schema Changes
  16. Intelligent Structure Models
  17. Blockchain
  18. Stateful Computing
  19. Appendix A: Connections Reference
  20. Appendix B: Data Type Reference
  21. Appendix C: Function Reference

Flat File Targets on Hadoop

Flat File Targets on Hadoop

Consider the following rules and guidelines for flat file targets in the Hadoop environment:
  • A mapping that runs in the Hadoop environment can write to a flat file target in the native environment.
  • The Data Integration Service truncates the target files and reject files before writing the data. When you use a flat file target, you cannot append output data to target files and reject files.
  • The Data Integration Service can write to a file output for a flat file target. When you have a flat file target in a mapping, you cannot write data to a command.
Consider the following rules and guidelines for HDFS flat file targets:
  • The Data Integration Service truncates the target files and reject files before writing the data. To append output data to HDFS target files and reject files, choose to append data if the HDFS target exists.
    Data is appended to reject files only if the reject file directory is on the Data Integration Service machine. If the directory is in the Hadoop environment, rejected rows are overwritten.
  • When you choose to append data if the HDFS target exists, the Data Integration Service appends the mapping execution ID to the names of the target files and reject files.
  • When you use a HDFS flat file target in a mapping, you must specify the full path that includes the output file directory and file name. The Data Integration Service might generate multiple output files in the output directory when you run the mapping in a Hadoop environment.
  • An HDFS target cannot reside on a remote cluster. A remote cluster is a cluster that is remote from the machine that the Hadoop connection references in the mapping.


Updated September 24, 2020