Configure the Databricks cluster to improve concurrency of jobs.
When you submit a job to Databricks, the cluster allocates resources to run the job. If the cluster does not have enough resources, it places the job in a queue. Pending jobs fail if resources do not become available before the queue timeout, which has a maximum of 30 minutes.
You can configure preemption on the cluster to control the amount of resources that Databricks allocates to each job, thereby allowing more jobs to run concurrently. You can also configure the timeout for the queue and the interval at which the Databricks Spark engine checks for available resources.
Configure the following environment variables for the Databricks Spark engine:
spark.databricks.preemption.enabled
Enables the Spark scheduler for preemption. Default is false.
Set to: true
spark.databricks.preemption.threshold
The percentage of cluster resources allocated to each submitted job. The job runs with the allocated resources until it completes. Default is 0.5, or 50 percent.
Set to a value lower than default, such as 0.1.
spark.databricks.preemption.timeout
The number of seconds that a job remains in the queue before failing. Default is 30.
Set to: 1800.
If you set a value higher than 1,800, Databricks ignores the value and uses the maximum timeout of 1,800.
spark.databricks.preemption.interval
The interval, in seconds, at which the Databricks Spark engine checks for available resources to assign to a queued job. Default is 5.
Set to a value lower than the timeout.
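As a sketch, the four properties above might be entered as Spark configuration key-value pairs in the cluster configuration. The values shown are the settings suggested in this section; the threshold of 0.1 is an example, and you should tune it for your workload:

```
spark.databricks.preemption.enabled true
spark.databricks.preemption.threshold 0.1
spark.databricks.preemption.timeout 1800
spark.databricks.preemption.interval 5
```

With a threshold of 0.1, each job is allocated roughly 10 percent of cluster resources, so up to about ten jobs can hold resources concurrently instead of two at the 50 percent default.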
Changes take effect after you restart the cluster.
Informatica integrates with Databricks and supports standard concurrency clusters. Standard concurrency clusters have a maximum queue time of 30 minutes, and jobs fail when the timeout is reached. The maximum queue time cannot be extended. Lowering the preemption threshold allows more jobs to run concurrently, but because each job receives a smaller percentage of resources, jobs can take longer to complete. Also, configuring the environment for preemption does not guarantee that all jobs will run. In addition to configuring preemption, you might choose to run cluster workflows, which create an ephemeral cluster, run the job, and then delete the cluster. For more information about Databricks concurrency, contact Azure Databricks.