Configuring YARN in Informatica Big Data Management®

Example. Directing Spark Jobs to a Queue

You work at an organization that runs the majority of its data processing jobs on the Spark engine. To ensure that Spark jobs have access to cluster resources, you direct Spark jobs to the queue Spark_only.
To set the YARN queue for Spark jobs, configure the YARN Queue Name property in the Spark configuration properties of the Hadoop connection. In this example, the value Spark_only is set for the property YARN Queue Name.
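For comparison, open-source Spark on YARN exposes the target queue through the spark.yarn.queue property. The following Python sketch builds that setting for the Spark_only queue. The helper names spark_queue_conf and to_submit_args are hypothetical, and the sketch assumes, without confirmation from this page, that the YARN Queue Name property plays the same role as spark.yarn.queue.

```python
def spark_queue_conf(queue_name: str) -> dict:
    """Return Spark-on-YARN settings that target the given queue.

    Hypothetical helper for illustration; not part of any Informatica API.
    """
    if not queue_name:
        raise ValueError("queue_name must be non-empty")
    return {
        "spark.master": "yarn",
        # spark.yarn.queue names the YARN queue that receives the application.
        "spark.yarn.queue": queue_name,
    }


def to_submit_args(conf: dict) -> list:
    """Render the settings as spark-submit style --conf arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


conf = spark_queue_conf("Spark_only")
submit_args = to_submit_args(conf)
print(submit_args)
```

In the Hadoop connection you set only the queue name. The sketch shows what an equivalent plain Spark configuration might look like.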
The scheduler directs Spark jobs to the queue that you configured. Submitted jobs go through a YARN scheduler, which assigns each job to a queue. If the job is a Spark job, it is assigned to the Spark_only queue. If the job is not a Spark job, it is assigned to the default queue.
The submitted Spark job is directed to the queue Spark_only. The submitted Blaze and Hive jobs are directed to the default queue.
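The routing described in this example can be modeled with a minimal Python sketch. It is only an illustration of the outcome: the actual assignment is performed by YARN based on the queue name configured in the Hadoop connection. The function names and the job names job_a, job_b, and job_c are hypothetical.

```python
from collections import defaultdict

SPARK_QUEUE = "Spark_only"   # queue configured for Spark jobs in this example
DEFAULT_QUEUE = "default"    # queue used by all other jobs


def assign_queue(engine: str) -> str:
    """Return the queue for a job, based on the engine that runs it."""
    if engine.strip().lower() == "spark":
        return SPARK_QUEUE
    return DEFAULT_QUEUE


def route_jobs(jobs):
    """Group (job_name, engine) pairs by the queue they are assigned to."""
    queues = defaultdict(list)
    for name, engine in jobs:
        queues[assign_queue(engine)].append(name)
    return dict(queues)


# Hypothetical submitted jobs: one per engine mentioned in the example.
submitted = [("job_a", "Spark"), ("job_b", "Blaze"), ("job_c", "Hive")]
routed = route_jobs(submitted)
print(routed)
```

Only the Spark job lands in Spark_only. The Blaze and Hive jobs fall through to the default queue, which matches the behavior described above.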
