The cluster workflow that creates an ephemeral cluster includes a Create Cluster task, at least one Mapping task, and a Delete Cluster task.
The following image shows a sample cluster workflow:
A cluster workflow uses the following components:
Cloud provisioning configuration
The cloud provisioning configuration is associated with the Create Cluster task through the cluster connection.
Cluster connection
The cluster connection to use with a cluster workflow is associated with a cloud provisioning configuration. You can use a Hadoop or Databricks cluster connection. When you run the workflow, the Data Integration Service creates a temporary cluster connection.
Create Cluster task
The Create Cluster task contains all the settings that the cloud platforms require to create a cluster with a master node and worker nodes. It also contains a reference to a cloud provisioning configuration. Include one Create Cluster task in a cluster workflow.
Mapping task
Add a big data mapping to the Mapping task. A cluster workflow can include more than one Mapping task. You can run some mappings on an existing cluster and others on a cluster that the workflow creates. Configure the mappings and Mapping tasks based on where you want to run each task.
Delete Cluster task
The Delete Cluster task terminates the cluster and deletes it along with all resources that the workflow creates.
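
The task composition described above can be illustrated with a small, product-independent sketch. It is not an Informatica API: the names `TaskKind`, `Task`, and `check_ephemeral_workflow` are hypothetical and exist only for illustration. The sketch assumes that a workflow for an ephemeral cluster needs exactly one Create Cluster task, at least one Mapping task, and a Delete Cluster task, and it checks only the task counts, not their order or their connections.

```python
from dataclasses import dataclass
from enum import Enum


class TaskKind(Enum):
    """Hypothetical labels for the three task types named in this section."""
    CREATE_CLUSTER = "Create Cluster"
    MAPPING = "Mapping"
    DELETE_CLUSTER = "Delete Cluster"


@dataclass(frozen=True)
class Task:
    name: str
    kind: TaskKind


def check_ephemeral_workflow(tasks):
    """Return a list of problems; an empty list means the task counts look right."""
    counts = {kind: 0 for kind in TaskKind}
    for task in tasks:
        counts[task.kind] += 1

    problems = []
    if counts[TaskKind.CREATE_CLUSTER] != 1:
        problems.append("expected exactly one Create Cluster task")
    if counts[TaskKind.MAPPING] < 1:
        problems.append("expected at least one Mapping task")
    if counts[TaskKind.DELETE_CLUSTER] < 1:
        problems.append("expected a Delete Cluster task")
    return problems


# Illustrative workflow; the task names are made up.
workflow = [
    Task("create_cluster", TaskKind.CREATE_CLUSTER),
    Task("load_orders", TaskKind.MAPPING),
    Task("delete_cluster", TaskKind.DELETE_CLUSTER),
]
print(check_ephemeral_workflow(workflow))
```

A check like this only mirrors the rules stated in this section. The actual validation that the product performs when you build or run a cluster workflow may differ.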