Configuring YARN in Informatica Big Data Management®

Informatica Developer Tasks

After the Hadoop administrator configures YARN queues on the Hadoop cluster, the Informatica developer can direct Blaze, Spark, and Hive jobs to run on specific queues. If the developer does not direct jobs to specific queues, jobs run on the default queue.
Optionally, the Informatica developer can complete the following tasks:
Direct Blaze jobs to a queue.
To direct Blaze jobs to a specific queue, configure the following Blaze configuration property in the Hadoop connection:
Property: YARN Queue Name
Description: The name of the YARN scheduler queue that the Blaze engine uses. The queue determines which cluster resources are available to the job.
Direct Spark jobs to a queue.
To direct Spark jobs to a specific queue, configure the following Spark configuration property in the Hadoop connection:
Property: YARN Queue Name
Description: The name of the YARN scheduler queue that the Spark engine uses. The queue determines which cluster resources are available to the job. The name is case sensitive.
Direct Hive jobs to a queue.
To direct MapReduce or Tez jobs on the Hive engine to a specific queue, configure the following Hive connection property:
Property: Environment SQL
Description: SQL commands to set the Hadoop environment.
Use the following format:
  • MapReduce.
    mapred.job.queue.name=<YARN queue name>
  • Tez.
    tez.queue.name=<YARN queue name>
For example:
mapred.job.queue.name=root.test
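
As a minimal sketch of the format above, the following Python helper assembles the queue setting for either execution engine. The function name `hive_queue_setting` and the engine labels `mapreduce` and `tez` are hypothetical, chosen only for illustration. They are not part of any Informatica API, and the resulting string still has to be entered in the Environment SQL property by hand.

```python
# Hypothetical helper: builds the '<property>=<queue>' string shown above.
# The property names come from the documented format; everything else is illustrative.
QUEUE_PROPERTIES = {
    "mapreduce": "mapred.job.queue.name",
    "tez": "tez.queue.name",
}


def hive_queue_setting(engine: str, queue: str) -> str:
    """Return the queue setting for the given Hive execution engine."""
    try:
        prop = QUEUE_PROPERTIES[engine.lower()]
    except KeyError:
        raise ValueError(f"unsupported engine: {engine!r}") from None
    if not queue:
        raise ValueError("queue name must not be empty")
    return f"{prop}={queue}"


print(hive_queue_setting("MapReduce", "root.test"))
print(hive_queue_setting("Tez", "root.test"))
```

Keeping the two property names in one lookup table makes it harder to pair a Tez job with the MapReduce property by mistake.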
Direct Sqoop jobs on the Spark engine to a queue.
To direct Sqoop jobs for mappings that run on the Spark engine to a specific queue, configure the following JDBC connection property:
Property: Sqoop Arguments
Description: The Sqoop connection-level argument to direct a MapReduce job for a Sqoop mapping to a specific YARN queue.
Use the following format:
-Dmapred.job.queue.name=<YARN queue name>
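
The argument format above could be generated along these lines. This is only a sketch: `sqoop_queue_argument` is a hypothetical name, and the whitespace check is an assumption added here on the grounds that a space inside the queue name would split the argument into two tokens.

```python
# Hypothetical helper: formats the Sqoop argument for a given YARN queue.
def sqoop_queue_argument(queue: str) -> str:
    """Return the -D argument that names the YARN queue for the Sqoop job."""
    queue = queue.strip()
    # Assumption: reject empty names and embedded whitespace, which would
    # break the single-token -Dkey=value form.
    if not queue or any(ch.isspace() for ch in queue):
        raise ValueError("queue name must be non-empty and contain no whitespace")
    return f"-Dmapred.job.queue.name={queue}"


print(sqoop_queue_argument("root.test"))
```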
Direct SQL override mappings on the Blaze engine to a queue.
To direct SQL override mappings that run on the Blaze engine to a specific queue, configure the following property in the Hive connection:
Property: Data Access Connection String
Description: The Hive connection string to specify the queue name for Hive SQL override mappings on the Blaze engine.
Use the following format:
  • MapReduce.
    mapred.job.queue.name=<YARN queue name>
  • Tez.
    tez.queue.name=<YARN queue name>
For example:
jdbc:hive2://business.com:10000/default;principal=hive/_HOST@INFAKRB?mapred.job.queue.name=root.test
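
To illustrate how the queue setting attaches to a Hive connection string, here is a hedged Python sketch. It assumes the usual HiveServer2 URL layout, in which Hive configuration settings follow a `?` and are separated by semicolons, and an optional `#` section of Hive variables comes last. The function name `with_queue` is hypothetical.

```python
# Hypothetical helper: appends a queue setting to a Hive JDBC connection string.
# Assumes the layout jdbc:hive2://host:port/db;session_vars?hive_conf#hive_vars.
def with_queue(conn: str, queue: str,
               queue_property: str = "mapred.job.queue.name") -> str:
    """Return conn with '<queue_property>=<queue>' added to its Hive conf list."""
    # Keep any trailing '#hive_vars' section at the end of the string.
    base, sep, hive_vars = conn.partition("#")
    # Start the conf list with '?' or extend an existing one with ';'.
    joiner = ";" if "?" in base else "?"
    return f"{base}{joiner}{queue_property}={queue}{sep}{hive_vars}"


url = "jdbc:hive2://business.com:10000/default;principal=hive/_HOST@INFAKRB"
print(with_queue(url, "root.test"))
print(with_queue(url, "root.test", queue_property="tez.queue.name"))
```

Passing `queue_property="tez.queue.name"` covers the Tez case without a second function.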
