Targets in a Streaming Mapping on Databricks

A streaming mapping that runs in the Databricks environment can include file and streaming targets.
Based on the type of target you write to, you can create the following data objects:
Amazon Kinesis
A physical data object that represents data in an Amazon Kinesis Data Firehose Delivery Stream. After you create an Amazon Kinesis connection, create an Amazon Kinesis data object to write to Amazon Kinesis Data Firehose.
Amazon S3
A physical data object that represents data in an Amazon S3 resource. After you configure an Amazon S3 connection, create an Amazon S3 data object to write to Amazon S3 targets.
In a Databricks environment, you cannot use CSV or Parquet payloads in a streaming mapping with Amazon S3 targets.
Azure Event Hubs
A physical data object that represents data in the Microsoft Azure Event Hubs data streaming platform and event ingestion service. Create an Azure Event Hubs data object to connect to an Event Hub target.
Confluent Kafka
A physical data object that represents data in a Kafka stream or a Confluent Kafka stream. After you configure a Messaging connection, create a Confluent Kafka data object to write data to Kafka brokers or Confluent Kafka brokers using schema registry.
Databricks Delta Lake
Databricks Delta Lake is an open source storage layer that provides ACID transactions and works on top of existing data lakes. Create a relational data object to write to a Databricks Delta Lake target.
Kafka
A physical data object that represents data in a Kafka stream. After you configure a Messaging connection, create a Kafka data object to write to Apache Kafka brokers.
Microsoft Azure Data Lake Storage Gen2
A Microsoft Azure Data Lake Storage Gen2 data object is a physical data object that represents a Microsoft Azure Data Lake Storage Gen2 table. Create a Microsoft Azure Data Lake Storage Gen2 data object to write to a Microsoft Azure Data Lake Storage Gen2 table.
You can run streaming mappings on the AWS Databricks service in the AWS cloud ecosystem or on the Azure Databricks service in Microsoft Azure cloud services. The following table lists the targets that you can include in a streaming mapping for each Databricks service:
Target                                    Services
Amazon Kinesis                            AWS
Amazon S3                                 AWS
Azure Event Hubs                          Azure
Confluent Kafka                           AWS, Azure
Databricks Delta Lake                     AWS, Azure
Kafka                                     AWS, Azure
Microsoft Azure Data Lake Storage Gen2    Azure
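
If you want to check a planned target against the table above before you design a mapping, the compatibility list can be captured as a small lookup. The following Python sketch is illustrative only. The names SUPPORTED_SERVICES, is_target_supported, and targets_for_service are hypothetical helpers and are not part of any Informatica or Databricks API. The data is copied from the table.

```python
from typing import List

# Hypothetical lookup built from the target/service table in this topic.
# It is not an Informatica API; it only mirrors the documented support matrix.
SUPPORTED_SERVICES = {
    "Amazon Kinesis": {"AWS"},
    "Amazon S3": {"AWS"},
    "Azure Event Hubs": {"Azure"},
    "Confluent Kafka": {"AWS", "Azure"},
    "Databricks Delta Lake": {"AWS", "Azure"},
    "Kafka": {"AWS", "Azure"},
    "Microsoft Azure Data Lake Storage Gen2": {"Azure"},
}


def is_target_supported(target: str, service: str) -> bool:
    """Return True if the table lists the target for the given Databricks service."""
    return service in SUPPORTED_SERVICES.get(target, set())


def targets_for_service(service: str) -> List[str]:
    """Return the targets listed for a service, sorted alphabetically."""
    return sorted(
        target
        for target, services in SUPPORTED_SERVICES.items()
        if service in services
    )


if __name__ == "__main__":
    for name in targets_for_service("Azure"):
        print(name)
```

For example, is_target_supported("Amazon S3", "Azure") returns False because the table lists Amazon S3 only for the AWS Databricks service.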