Microsoft Azure Data Lake Storage Gen2 Data Object

A Microsoft Azure Data Lake Storage Gen2 data object is a physical data object that represents data in Microsoft Azure Data Lake Storage Gen2. After you create a Microsoft Azure Data Lake Storage Gen2 connection, create a Microsoft Azure Data Lake Storage Gen2 data object write operation to write to Microsoft Azure Data Lake Storage Gen2.
Azure Data Lake Storage Gen2 inherits capabilities of Azure Data Lake Storage Gen1, such as file system semantics, directory-level and file-level security, and scale. These capabilities are combined with Azure Blob storage capabilities, such as low-cost tiered storage and high availability.
Azure Data Lake Storage Gen2 is built on Azure Blob storage. You can use Azure Databricks version 5.4 or Azure HDInsight 4.0 to access the data stored in Azure Data Lake Storage Gen2.
When you configure the data object operation properties, specify the format in which the data object writes data. You can specify JSON, Parquet, or Avro as the format. When you specify the JSON format, you must provide a sample file. When you specify the Avro format, you must provide a sample Avro schema in a .avsc file.
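As a rough illustration of the sample files mentioned above, the following Python sketch builds an Avro schema and a matching JSON record. The record name, field names, and file names are hypothetical and chosen only for illustration. The exact layout that the product expects for a JSON sample file is not stated here, so treat the single-record form as an assumption.

```python
import json

# Hypothetical event layout: the record and field names are illustrative only.
AVRO_SCHEMA = {
    "type": "record",
    "name": "SensorReading",
    "namespace": "example.streaming",
    "fields": [
        {"name": "device_id", "type": "string"},
        {"name": "reading", "type": "double"},
        {"name": "event_time", "type": "long"},
    ],
}

# One sample record whose keys match the schema fields (assumed sample layout).
JSON_SAMPLE = {
    "device_id": "dev-001",
    "reading": 21.5,
    "event_time": 1700000000000,
}


def render_sample_files():
    """Return the text of both sample files, keyed by a hypothetical file name."""
    return {
        "sensor_reading.avsc": json.dumps(AVRO_SCHEMA, indent=2),
        "sensor_reading.json": json.dumps(JSON_SAMPLE),
    }


def write_sample_files(directory="."):
    """Write the rendered sample files into the given directory."""
    for name, text in render_sample_files().items():
        with open(f"{directory}/{name}", "w", encoding="utf-8") as handle:
            handle.write(text + "\n")
```

Calling `write_sample_files()` would produce one .avsc file and one JSON file that you could then adapt to the actual structure of your data.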
You can pass any payload format directly from source to target in streaming mappings. You can project columns in binary format to pass a payload from source to target in its original form or to pass a payload format that is not supported.
Streaming mappings can read, process, and write hierarchical data. You can use array, struct, and map complex data types to process the hierarchical data. You assign complex data types to ports in a mapping to flow hierarchical data. Ports that flow hierarchical data are called complex ports.
For more information about processing hierarchical data, see the Data Engineering Integration User Guide.
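To make the array, struct, and map types more concrete, the following Python sketch declares a hierarchical Avro schema and lists which fields would need complex ports. It assumes that an Avro array, record, and map correspond to the array, struct, and map complex data types. The schema, the field names, and the `complex_ports` helper are hypothetical and are not part of the product.

```python
# Hypothetical hierarchical schema: all names are illustrative only.
HIERARCHICAL_SCHEMA = {
    "type": "record",
    "name": "OrderEvent",
    "fields": [
        {"name": "order_id", "type": "string"},
        {"name": "items", "type": {"type": "array", "items": "string"}},
        {
            "name": "customer",
            "type": {
                "type": "record",
                "name": "Customer",
                "fields": [
                    {"name": "name", "type": "string"},
                    {"name": "zip", "type": "string"},
                ],
            },
        },
        {"name": "attributes", "type": {"type": "map", "values": "string"}},
    ],
}

# Assumed correspondence between Avro types and complex data types.
AVRO_TO_COMPLEX = {"array": "array", "record": "struct", "map": "map"}


def complex_ports(schema):
    """Return {field name: complex data type} for fields that hold hierarchical data."""
    ports = {}
    for field in schema["fields"]:
        field_type = field["type"]
        # Primitive types are plain strings; nested types are dictionaries.
        if isinstance(field_type, dict) and field_type.get("type") in AVRO_TO_COMPLEX:
            ports[field["name"]] = AVRO_TO_COMPLEX[field_type["type"]]
    return ports


print(complex_ports(HIERARCHICAL_SCHEMA))
# {'items': 'array', 'customer': 'struct', 'attributes': 'map'}
```

In this sketch, `order_id` is a primitive field and is left out of the result, while the other three fields each carry one of the complex data types.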