Third-Party Applications

Data Engineering Streaming uses third-party distributions to connect to a Spark engine on a Hadoop cluster or to a Databricks Spark engine on a Databricks cluster.
In a Hadoop environment, Data Engineering Streaming pushes job processing to the Spark engine and uses YARN to manage the resources on the Spark cluster.
In a Databricks environment, Data Engineering Streaming pushes job processing to the Databricks cluster, and the Databricks Spark engine runs the job.
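
The two execution paths described above can be summarized as a small lookup. The following Python sketch is illustrative only: `ExecutionPath`, `EXECUTION_PATHS`, and `resolve_execution_path` are hypothetical names and are not part of any Data Engineering Streaming API. The resource manager for the Databricks path is left unset because this section does not name one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ExecutionPath:
    """Where a streaming job runs for a given environment (illustrative only)."""
    cluster: str
    engine: str
    resource_manager: Optional[str]  # None when the documentation does not name one


# Hypothetical lookup table that mirrors the two environments described above.
EXECUTION_PATHS = {
    "hadoop": ExecutionPath(
        cluster="Hadoop cluster",
        engine="Spark engine",
        resource_manager="YARN",
    ),
    "databricks": ExecutionPath(
        cluster="Databricks cluster",
        engine="Databricks Spark engine",
        resource_manager=None,
    ),
}


def resolve_execution_path(environment: str) -> ExecutionPath:
    """Return the execution path for an environment name, ignoring case and padding."""
    key = environment.strip().lower()
    if key not in EXECUTION_PATHS:
        raise ValueError(f"Unsupported environment: {environment!r}")
    return EXECUTION_PATHS[key]
```

For example, `resolve_execution_path("Hadoop")` returns the entry whose engine is the Spark engine and whose resource manager is YARN, while an unrecognized environment name raises a `ValueError`.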