HDInsight clusters are made up of the following types of nodes, each of which corresponds to a VM:
Head node
The head node of a cluster controls distribution of processing tasks to other nodes in the cluster. The head node runs Hadoop services, including HDFS, YARN, Hive metastore, application timeline server (ATS), and others.
Worker nodes
The worker nodes in a cluster perform processing tasks. You can specify any number of worker nodes.
You can manually scale the number of worker nodes up or down. HDInsight does not support auto-scaling.
Gateway nodes
In addition to the head and worker nodes, each cluster includes two gateway nodes that run management and security tasks. Users do not have access to gateway nodes.
Cluster workflows enable you to automate the creation of a cluster and run specified mappings on it. The cluster workflow can include a task to delete the cluster when processing is complete. For more information about cluster workflows, see the Big Data Management User Guide.
The following node types support cluster workflow operations on HDInsight:
A series nodes:
A3
A4
A7
DS_v2 series nodes:
D5_v2
D12_v2
D13_v2
D14_v2
DS_v2 series nodes are of the General Purpose type.
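The list above can be captured as a small lookup, which is handy if you want to check a node size in a script before you enter it in a workflow. The following is a minimal sketch, not part of the product: the names `LISTED_NODE_TYPES` and `listed_series` are hypothetical, and the handling of the `Standard_` prefix assumes that full Azure size names follow the `Standard_<size>` pattern.

```python
from typing import Optional

# Node types that this section lists as supporting cluster workflow
# operations, grouped under the series labels used in the list above.
LISTED_NODE_TYPES = {
    "A": ("A3", "A4", "A7"),
    "DS_v2": ("D5_v2", "D12_v2", "D13_v2", "D14_v2"),
}

# Assumption: full Azure size names prepend this prefix to the short name.
AZURE_PREFIX = "Standard_"


def listed_series(node_type: str) -> Optional[str]:
    """Return the series label if node_type appears in the list, else None."""
    short = node_type.strip()
    if short.startswith(AZURE_PREFIX):
        short = short[len(AZURE_PREFIX):]
    for series, sizes in LISTED_NODE_TYPES.items():
        if short in sizes:
            return series
    return None


if __name__ == "__main__":
    for candidate in ("D13_v2", "Standard_A4", "Standard_D4_v2"):
        print(candidate, "->", listed_series(candidate))
```

A result of `None` only means that the size is not in the list above. It does not mean that the size is invalid for HDInsight.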
The following image shows the Cluster Size Options properties in the Advanced properties tab of the Create Cluster task:
You can also specify other cluster node types when you configure the Create Cluster task. Manually enter a valid node type. For example, type Standard_D4_v2.
For a list of valid node types and more information about the specifications of available cluster node types, see the Azure HDInsight documentation.
The following image shows an example where the user has manually typed an alternate node size for the Head Node VM Size and Worker Node VM Size properties: