FileName Port in Complex File

When you create a data object read or write operation for HDFS complex files, the FileName port appears by default.
When the Spark engine writes to HDFS complex files using a FileName port, it uses the following process to write the data:
  1. At run time, the Data Integration Service creates a separate directory for each value in the FileName port and adds the target files to those directories.
  2. The file rollover process closes the file that is currently being written to and creates a new file based on the configured rollover value.
    You can configure the following optional rollover parameters at design time, based on time and size:
    • Stream Rollover Time in Hours. Specify the time in hours after which the Spark engine rolls over a target file.
    • Stream Rollover Size in GB. Specify the size in GB at which the Spark engine rolls over a target file.
  3. When a target file reaches the configured rollover value, the target file is rolled over and moved to the specified HDFS complex file target location.
  4. The Spark engine creates sub-directories in the specified HDFS complex file target location for each value in the FileName port.
  5. The Spark engine moves the rolled-over target files to the sub-directories created for each value in the FileName port in the specified HDFS complex file target location.
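
The directory layout described in the steps above can be illustrated with a minimal sketch. This is not Informatica code. The function name `group_by_filename`, the row structure, and the `/data/target` path are hypothetical, chosen only to show how one sub-directory per FileName value might be derived.

```python
import posixpath
from collections import defaultdict


def group_by_filename(rows, target_dir):
    """Map each distinct FileName value to a sub-directory of target_dir.

    Hypothetical helper: it only models the layout, it does not write to HDFS.
    """
    groups = defaultdict(list)
    for row in rows:
        # One sub-directory per FileName value, under the target location.
        subdir = posixpath.join(target_dir, row["FileName"])
        groups[subdir].append(row)
    return dict(groups)


# Assumed sample rows; "FileName" stands in for the FileName port value.
rows = [
    {"FileName": "sales", "payload": "a"},
    {"FileName": "returns", "payload": "b"},
    {"FileName": "sales", "payload": "c"},
]

layout = group_by_filename(rows, "/data/target")
for subdir, members in layout.items():
    print(subdir, len(members))
```

In this sketch, the two rows with the FileName value `sales` land in the same sub-directory, and the `returns` row gets its own.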
You can configure both rollover schemes for an HDFS complex file target. The Spark engine rolls the file over based on whichever event occurs first. For example, if you set the rollover time to 1 hour and the rollover size to 1 GB, the target service rolls the file over when the file reaches 1 GB, even if the 1-hour period has not elapsed.
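
The "first event wins" rule can be expressed as a simple check. The following is a minimal sketch under stated assumptions, not the engine's actual implementation: `RolloverConfig` and `should_roll_over` are hypothetical names, and 1 GB is assumed to mean 1024³ bytes.

```python
from dataclasses import dataclass
from typing import Optional

GB = 1024 ** 3  # Assumption: 1 GB is treated as 1024^3 bytes.


@dataclass
class RolloverConfig:
    """Hypothetical holder for the two optional rollover parameters."""
    time_hours: Optional[float] = None  # Stream Rollover Time in Hours
    size_gb: Optional[float] = None     # Stream Rollover Size in GB


def should_roll_over(config, elapsed_hours, file_size_bytes):
    """Return True as soon as either configured threshold is reached."""
    time_hit = config.time_hours is not None and elapsed_hours >= config.time_hours
    size_hit = config.size_gb is not None and file_size_bytes >= config.size_gb * GB
    return time_hit or size_hit


config = RolloverConfig(time_hours=1, size_gb=1)

# Size threshold reached after 30 minutes: roll over before the hour is up.
print(should_roll_over(config, 0.5, 1 * GB))

# Neither threshold reached: keep writing to the current file.
print(should_roll_over(config, 0.5, 200 * 1024 ** 2))

# Time threshold reached while the file is still small: roll over.
print(should_roll_over(config, 1.0, 200 * 1024 ** 2))
```

Leaving a parameter unset (`None`) in this sketch disables that trigger, which mirrors the fact that both rollover parameters are optional.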
