Table of Contents


  1. Preface
  2. Advanced clusters
  3. Setting up AWS
  4. Setting up Google Cloud
  5. Setting up Microsoft Azure
  6. Setting up a self-service cluster
  7. Setting up a local cluster
  8. Advanced configurations
  9. Troubleshooting
  10. Appendix A: Command reference

Advanced Clusters

NSG for master nodes

Before you create a custom network security group (NSG), it helps to understand the inbound and outbound rules of the default NSG.
The following image shows the default master node NSG:
[Image: The inbound and outbound rules for the default network security group for the master node.]

Inbound rules

The following table describes the inbound rules for the NSG:

SSH access
    The source for this rule is the IP address of the Secure Agent machine. By default, SSH access uses port 22.

Apache Livy server access
    The source for this rule is the IP address of the Secure Agent machine. By default, the Livy server access rule uses TCP port 31447. Data preview uses this rule.

Kubernetes API Server access
    The Secure Agent uses this rule to access the Kubernetes API server for tasks such as deploying and monitoring Kubernetes applications and monitoring cluster resources.
    Any Kubernetes client that is external to the advanced cluster also needs this rule to use the advanced cluster.

Other default inbound rules
    The following default inbound rules also apply:
      • Intra-VNet communication. Allows worker nodes to communicate with master nodes.
      • Inbound traffic from the load balancer. Allows the load balancer to distribute Kubernetes requests to the master node.
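
As a rough illustration of the two inbound rules whose ports are stated above, the following Python sketch models them as data and checks whether a connection matches one of them. The names InboundRule, default_inbound_rules, and is_allowed, and the sample IP addresses, are hypothetical. The Kubernetes API server rule is left out because its port is not stated here, and real NSG evaluation also involves rule priorities and deny rules, which this sketch ignores.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class InboundRule:
    """One simplified allow rule: a source CIDR range and a TCP port."""
    name: str
    source: str
    port: int


def default_inbound_rules(agent_ip):
    """Build the SSH and Livy rules with the Secure Agent IP as the source.

    Assumes an IPv4 address, so a /32 prefix denotes that single host.
    """
    agent_only = f"{agent_ip}/32"
    return (
        InboundRule("SSH access", agent_only, 22),
        InboundRule("Apache Livy server access", agent_only, 31447),
    )


def is_allowed(rules, source_ip, port):
    """Return True if any rule matches both the source address and the port."""
    src = ip_address(source_ip)
    return any(src in ip_network(r.source) and r.port == port for r in rules)


# Hypothetical Secure Agent address from a documentation-only IP range.
rules = default_inbound_rules("203.0.113.10")
print(is_allowed(rules, "203.0.113.10", 22))   # Secure Agent over SSH
print(is_allowed(rules, "198.51.100.7", 22))   # some other machine over SSH
```

A custom NSG that replaces the default one would need equivalent rules for the Secure Agent machine, with the ports adjusted if you changed the defaults.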

Outbound rules

The default outbound rules allow traffic to any node in the same VNet and to the internet. Data Integration needs access to various Azure services to support certain deployments.
Instead of using outbound rules to restrict outbound traffic to the internet, you can define firewall policies to validate outbound traffic. You can associate the subnet in which the advanced cluster is configured with a route table that routes all traffic to a firewall.
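
The effect of such a route table can be sketched with a longest-prefix-match lookup in Python. This is a simplified model, not Azure's implementation: the VNet address range 10.0.0.0/16 and the function name next_hop are assumptions made for illustration, and the next hop labels follow Azure's next hop type names (VnetLocal, VirtualAppliance). The sketch assumes that traffic inside the VNet keeps its local route while everything else falls through to the default route that points at the firewall.

```python
from ipaddress import ip_address, ip_network

# Hypothetical route table: (address prefix, next hop type).
ROUTES = (
    ("10.0.0.0/16", "VnetLocal"),       # assumed VNet address space
    ("0.0.0.0/0", "VirtualAppliance"),  # default route, sent to the firewall
)


def next_hop(destination, routes=ROUTES):
    """Pick the next hop of the most specific route that contains the destination."""
    dest = ip_address(destination)
    matches = [
        (ip_network(prefix), hop)
        for prefix, hop in routes
        if dest in ip_network(prefix)
    ]
    if not matches:
        return None
    # Longest prefix wins, as in ordinary IP routing.
    return max(matches, key=lambda match: match[0].prefixlen)[1]


print(next_hop("10.0.1.4"))      # address inside the assumed VNet range
print(next_hop("198.51.100.7"))  # address outside the VNet
```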
Firewall policies are more flexible because the destination can be a domain, a subdomain, or a domain name that contains wildcard characters. This lets you create application rules for internet services with public IP addresses or for a range of Azure services such as *, *, *, and *.
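
To show why wildcard destinations are convenient, the following Python sketch checks host names against a small allow list using shell-style patterns from the standard fnmatch module. The domains and the function name is_destination_allowed are hypothetical, and fnmatch only approximates how a firewall product matches wildcard domain names, so treat this as an illustration of the idea rather than of any firewall's exact matching rules.

```python
from fnmatch import fnmatchcase

# Hypothetical allow list: one wildcard pattern and one exact host name.
ALLOWED_PATTERNS = ("*.example.com", "login.example.net")


def is_destination_allowed(hostname, patterns=ALLOWED_PATTERNS):
    """Return True if the host name matches any allowed pattern.

    Host names are case-insensitive, so both sides are lowercased, and a
    trailing dot (fully qualified form) is dropped before matching.
    """
    host = hostname.lower().rstrip(".")
    return any(fnmatchcase(host, pattern.lower()) for pattern in patterns)


for name in ("storage.example.com", "LOGIN.example.net", "example.org"):
    print(name, is_destination_allowed(name))
```

A single wildcard pattern covers every subdomain of a service, which an IP-based outbound rule cannot express when the service's addresses change over time.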
When both NSG rules and firewall policies exist, Data Integration considers both.

