Enable Scheduling and Node Labeling


To enable scheduling and node labeling in the Hadoop environment, update the yarn-site.xml file in the domain environment.
Configure the following properties:
yarn.resourcemanager.scheduler.class
Defines the YARN scheduler that the Data Integration Service uses to assign resources on the cluster.
<property>
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.[Scheduler Type].[Scheduler Type]Scheduler</value>
</property>
For example:
<property>
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
</property>

yarn.node-labels.enabled
Enables node labeling.
<property>
  <name>yarn.node-labels.enabled</name>
  <value>TRUE</value>
</property>

yarn.node-labels.fs-store.root-dir
Specifies the HDFS location used to store node labels and update them dynamically.
<property>
  <name>yarn.node-labels.fs-store.root-dir</name>
  <value>hdfs://[Node name]:[Port]/[Path to store]/[Node labels]/</value>
</property>
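
After you edit yarn-site.xml, a quick check can confirm that the three properties are present. The following Python sketch is one possible way to do that and is not part of the product. The function names, the sample host name, port, and path are hypothetical, and the hdfs:// prefix check is an assumption based on the value template shown above.

```python
import xml.etree.ElementTree as ET

# The three properties that this topic asks you to configure.
REQUIRED = (
    "yarn.resourcemanager.scheduler.class",
    "yarn.node-labels.enabled",
    "yarn.node-labels.fs-store.root-dir",
)


def read_properties(xml_text):
    """Return a {name: value} dict built from <property> elements."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def check_scheduling_and_labeling(props):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    for name in REQUIRED:
        if not props.get(name):
            problems.append(f"missing or empty: {name}")
    enabled = props.get("yarn.node-labels.enabled", "")
    if enabled and enabled.lower() != "true":
        problems.append("yarn.node-labels.enabled is not set to true")
    root_dir = props.get("yarn.node-labels.fs-store.root-dir", "")
    # Assumption: the store location uses the hdfs:// scheme, as in the template.
    if root_dir and not root_dir.startswith("hdfs://"):
        problems.append("yarn.node-labels.fs-store.root-dir is not an hdfs:// URI")
    return problems


# Hypothetical sample: the host name, port, and path are placeholders.
SAMPLE = """<configuration>
  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
  </property>
  <property>
    <name>yarn.node-labels.enabled</name>
    <value>TRUE</value>
  </property>
  <property>
    <name>yarn.node-labels.fs-store.root-dir</name>
    <value>hdfs://namenode.example.com:8020/yarn/node-labels/</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    for problem in check_scheduling_and_labeling(read_properties(SAMPLE)):
        print(problem)
```

To check a real file, read its contents and pass the text to read_properties in place of SAMPLE. The sketch only verifies that the values are present and well formed. It does not confirm that the scheduler class exists on the cluster or that the HDFS path is reachable.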

Updated October 23, 2019