FileName Port in ADLS Gen2

When you create a data object read or write operation for Microsoft Azure Data Lake Storage Gen2 (ADLS Gen2) files, the FileName port appears by default.
When the Spark engine writes to ADLS Gen2 files using a FileName port, it uses the following process to write the data:
  1. The Data Integration Service creates a separate directory for each value in the FileName port and adds the target files to those directories.
  2. The file rollover process closes the file that is currently being written to and creates a new file based on the configured rollover value.
    Effective in 10.4.1, you can configure the following time-based and size-based rollover parameters at design time:
    • Stream Rollover Time in Hours. The period of time, in hours, after which the target file is rolled over.
    • Stream Rollover Size in GB. The size, in GB, at which the target file is rolled over.
  3. When a target file reaches the configured rollover value, the target file is rolled over and moved to the specified ADLS Gen2 target location.
  4. The Spark engine creates sub-directories in the specified ADLS Gen2 target location for each value in the FileName port.
  5. The Spark engine moves the rolled-over target files to the sub-directories created for each value in the FileName port in the ADLS Gen2 target location.
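The resulting directory layout in steps 4 and 5 can be pictured with a small sketch. This is only an illustration, not the Spark engine's implementation: the function name `plan_target_paths`, the target location `/streaming/out`, and the `part-0001` style file names are all hypothetical.

```python
import posixpath
from collections import defaultdict


def plan_target_paths(target_location, rolled_files):
    """Group rolled-over files under one sub-directory per FileName port value.

    rolled_files is an iterable of (filename_port_value, rolled_file_name)
    pairs. Both the pairs and the file names are assumptions made for this
    illustration.
    """
    layout = defaultdict(list)
    for port_value, file_name in rolled_files:
        # One sub-directory per distinct FileName port value (step 4).
        subdir = posixpath.join(target_location, port_value)
        # The rolled-over file ends up inside that sub-directory (step 5).
        layout[subdir].append(posixpath.join(subdir, file_name))
    return dict(layout)


example = plan_target_paths(
    "/streaming/out",  # hypothetical ADLS Gen2 target location
    [("orders", "part-0001"), ("returns", "part-0001"), ("orders", "part-0002")],
)
for directory, files in example.items():
    print(directory, files)
```

In this sketch, two distinct FileName values produce two sub-directories, and every rolled-over file for a given value lands in the same sub-directory.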
You can configure both rollover schemes for an ADLS Gen2 target file. The Spark engine rolls the file over based on whichever condition is met first. For example, if you set the rollover time to 1 hour and the rollover size to 1 GB, the Spark engine rolls the file over as soon as the file reaches 1 GB, even if the 1 hour period has not elapsed.
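The "first condition met" rule can be expressed as a simple check. The following is a minimal sketch under stated assumptions, not the product's code: the class `RolloverPolicy`, its field names, and the interpretation of 1 GB as 1024^3 bytes are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

GB = 1024 ** 3  # assumption: 1 GB is treated as 1024^3 bytes


@dataclass
class RolloverPolicy:
    """Hypothetical model of the two design-time rollover parameters."""

    rollover_hours: Optional[float] = None  # Stream Rollover Time in Hours
    rollover_gb: Optional[float] = None     # Stream Rollover Size in GB

    def should_roll(self, elapsed_seconds: float, size_bytes: int) -> bool:
        # A parameter that is not configured never triggers a rollover.
        time_hit = (
            self.rollover_hours is not None
            and elapsed_seconds >= self.rollover_hours * 3600
        )
        size_hit = (
            self.rollover_gb is not None
            and size_bytes >= self.rollover_gb * GB
        )
        # Whichever condition is met first causes the rollover.
        return bool(time_hit or size_hit)


policy = RolloverPolicy(rollover_hours=1, rollover_gb=1)

# File reaches 1 GB after 10 minutes: size triggers before time.
size_first = policy.should_roll(elapsed_seconds=600, size_bytes=1 * GB)
# File is still small after 1 hour: time triggers.
time_first = policy.should_roll(elapsed_seconds=3600, size_bytes=10 * 1024 ** 2)
# Neither threshold reached yet: keep writing to the current file.
neither = policy.should_roll(elapsed_seconds=600, size_bytes=10 * 1024 ** 2)
```

The first call mirrors the example in the text: with both parameters set to 1, a file that reaches 1 GB is rolled over even though the hour has not elapsed.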
