Complex Data Flow with Data Duplication and Load Balancing
A data flow can contain target services running on a single EDS Node or on multiple EDS Nodes. In such a deployment, EDS sends all the messages that the source publishes to each standalone target service. For a target service that you deploy on multiple nodes, EDS uses the round-robin method to distribute a subset of the messages to each target service instance.
The following image shows how EDS performs load balancing across multiple instances of a target service while duplicating data flows across different targets:
The following process describes how EDS balances the load:
A file source service named FileSourceSvc publishes messages on a topic called logs.
A RulePoint target service named RulePointSvc, a standalone Cassandra target service named CassandraSvc, and three instances of an HDFS target service named HDFSSvc receive those messages.
You deploy HDFSSvc on three EDS Nodes for purposes of load balancing.
EDS distributes messages so that CassandraSvc and RulePointSvc receive all the messages that the source service publishes.
Simultaneously, EDS performs load balancing across the three instances of HDFSSvc in round-robin fashion.
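The distribution described above can be sketched as a small simulation. This is not EDS code and does not use any EDS API. It is a minimal Python model, assuming that each message goes to every standalone target and to exactly one instance of the load-balanced target, chosen in strict rotation. The function name route and the instance labels HDFSSvc@node1 through HDFSSvc@node3 are hypothetical and exist only for illustration.

```python
from collections import defaultdict
from itertools import cycle


def route(messages, standalone_targets, balanced_instances):
    """Model of duplication plus round-robin distribution (illustrative only).

    Every message is copied to each standalone target. Each message is also
    handed to exactly one of the balanced instances, in rotating order.
    """
    delivered = defaultdict(list)
    # cycle() yields the instances over and over in their original order.
    rotation = cycle(balanced_instances) if balanced_instances else None

    for message in messages:
        # Duplication: every standalone target sees every message.
        for target in standalone_targets:
            delivered[target].append(message)
        # Load balancing: only one instance sees this message.
        if rotation is not None:
            delivered[next(rotation)].append(message)

    return dict(delivered)


# Hypothetical deployment mirroring the example in the text.
messages = [f"log-{i}" for i in range(6)]
result = route(
    messages,
    standalone_targets=["RulePointSvc", "CassandraSvc"],
    balanced_instances=["HDFSSvc@node1", "HDFSSvc@node2", "HDFSSvc@node3"],
)

for name, received in result.items():
    print(name, received)
```

In this model, RulePointSvc and CassandraSvc each receive all six messages, while each HDFSSvc instance receives two of them, so the three instances together cover the full stream without overlap.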