Complex Data Flow with Data Duplication and Load Balancing
A data flow can contain target services that run on a single EDS Node or on multiple EDS Nodes. In such a deployment, EDS sends every message that the source publishes to each standalone target service. For a target service that you deploy on multiple nodes, EDS uses the round-robin method to distribute the messages among the instances, so that each instance receives a subset of the messages.
The following image shows how EDS performs load balancing across multiple instances of a target service while duplicating data flows across different targets:
The following process describes how EDS balances the load:
A file source service named FileSourceSvc publishes messages on a topic called logs.
A RulePoint target service named RulePointSvc, a standalone Cassandra target service named CassandraSvc, and three instances of an HDFS target service named HDFSSvc receive those messages.
You deploy HDFSSvc on three EDS Nodes for purposes of load balancing.
EDS distributes messages so that CassandraSvc and RulePointSvc receive all the messages that the source service publishes.
EDS performs load balancing across the three instances of HDFSSvc in round-robin fashion.
EDS delivers each message to exactly one instance of HDFSSvc.
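The process above can be sketched as a small simulation. This is a minimal illustration of the described behavior, not EDS code or the EDS API: the route function, the node1 to node3 instance labels, and the choice of which instance receives the first message are assumptions made for the example. It also assumes that at least one load-balanced instance exists.

```python
from collections import defaultdict
from itertools import cycle


def route(messages, standalone_targets, balanced_instances):
    """Model the delivery pattern: duplicate to standalone targets,
    round-robin across the instances of a multi-node target.

    Hypothetical helper for illustration only; not part of EDS.
    """
    delivered = defaultdict(list)
    # cycle() yields the instances in order, then starts over.
    next_instance = cycle(balanced_instances)
    for msg in messages:
        # Duplication: every standalone target receives every message.
        for target in standalone_targets:
            delivered[target].append(msg)
        # Load balancing: exactly one instance receives this message.
        delivered[next(next_instance)].append(msg)
    return dict(delivered)


# Six messages published on the logs topic (sample data).
messages = [f"log-{i}" for i in range(6)]
result = route(
    messages,
    standalone_targets=["RulePointSvc", "CassandraSvc"],
    # Instance labels are assumed; the source names only HDFSSvc.
    balanced_instances=["HDFSSvc@node1", "HDFSSvc@node2", "HDFSSvc@node3"],
)
for receiver, received in result.items():
    print(receiver, received)
```

In this sketch, RulePointSvc and CassandraSvc each receive all six messages, while each of the three HDFSSvc instances receives two of them.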