Provision an HDInsight cluster in the resource group that you created for the Big Data Management implementation. The cluster must be in the same region as the storage that it accesses, which you created in step 2.6.
A Hadoop cluster consists of several virtual machines (nodes) that are used for distributed processing of tasks. Azure HDInsight is a cloud distribution of the Hadoop components from the Hortonworks Data Platform (HDP). Big Data Management integrates with HDInsight to push processing of mappings and other jobs to the cluster.
When you provision an HDInsight cluster, attach a Hive metastore database. The Hive metastore is a semantic repository that stores metadata for Hive tables, enabling Hadoop operations on the cluster.