When a domain contains multiple nodes, the nodes are resilient to temporary failures in communication with other nodes in the domain.
Nodes are resilient to the following temporary connection failures:
A non-master gateway node becomes unavailable.
Every node in the domain sends a communication signal to the master gateway node every 15 seconds. For nodes with the service role, the signal includes a list of the application services running on the node.
All nodes have a resilience timeout of 90 seconds. If a node fails to connect to the master gateway node within the resilience timeout period, the master gateway node marks the node unavailable. If the node that fails to connect has the service role, the master gateway node also reassigns its application services to a backup node. This ensures that the application services continue to run despite node failures.
The master gateway node becomes unavailable.
You can configure more than one node to serve as a gateway. If the master gateway node becomes unavailable, the Service Managers on the other gateway nodes elect another master gateway node.
If you configure only one node to serve as the gateway and that node becomes unavailable, all other nodes shut down.
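The timing rules described above for a node that stops reporting can be illustrated with a small Python sketch. This is not the product's implementation. The class and method names (MasterGatewaySketch, record_heartbeat, sweep) and the single named backup node are hypothetical. Only the 15-second interval and the 90-second resilience timeout come from the text. The sketch assumes a node counts as failed once more than 90 seconds have passed since its last signal.

```python
from dataclasses import dataclass, field

HEARTBEAT_INTERVAL_S = 15   # interval between signals, from the text
RESILIENCE_TIMEOUT_S = 90   # resilience timeout, from the text


@dataclass
class NodeState:
    last_heartbeat: float
    services: list = field(default_factory=list)
    available: bool = True


class MasterGatewaySketch:
    """Hypothetical model of the bookkeeping on the master gateway node."""

    def __init__(self, backup_node):
        self.backup_node = backup_node  # assumed: one fixed backup node
        self.nodes = {}

    def record_heartbeat(self, node, now, services=()):
        # A signal marks the node available and refreshes its service list.
        self.nodes[node] = NodeState(last_heartbeat=now, services=list(services))

    def sweep(self, now):
        """Mark timed-out nodes unavailable; return {service: new_node}."""
        reassigned = {}
        for name, state in self.nodes.items():
            timed_out = now - state.last_heartbeat > RESILIENCE_TIMEOUT_S
            if not (state.available and timed_out):
                continue
            state.available = False
            backup = self.nodes.get(self.backup_node)
            # Reassign only if a different, still-available backup node exists.
            if name != self.backup_node and backup is not None and backup.available:
                for service in state.services:
                    backup.services.append(service)
                    reassigned[service] = self.backup_node
                state.services = []
        return reassigned
```

In this sketch, a caller records each incoming signal with record_heartbeat and calls sweep periodically. A node that has missed about six consecutive 15-second signals exceeds the 90-second timeout, and its services move to the backup node.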