You can configure the following Data Integration Service properties:
Maximum Hadoop batch pool size
Determines the maximum number of deployed jobs that you can run concurrently in the Hadoop environment. The Data Integration Service moves deployed Hadoop jobs from the queue to the Hadoop batch pool when enough resources are available. The default value is 100.
Maximum native batch pool size
Determines the maximum number of deployed jobs that you can run concurrently in the native environment. The Data Integration Service moves deployed native jobs from the queue to the native batch pool when enough resources are available. The default value is 10.
Maximum on-demand pool size
Determines the maximum number of on-demand jobs that you can run concurrently. On-demand jobs include data previews, profiling jobs, SQL queries, and web service requests. The Data Integration Service runs on-demand jobs immediately if enough resources are available. Otherwise, the Data Integration Service rejects the job. The default value is 10.
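The difference between the batch pools and the on-demand pool can be illustrated with a small model. The following Python sketch is not Informatica code. The class and method names are hypothetical, and it assumes that free pool slots are the only resource the service checks. It shows only that a deployed job waits in a queue when its pool is full, while an on-demand job is rejected.

```python
from collections import deque


class PoolModel:
    """Toy model of one service process (hypothetical, for illustration only)."""

    def __init__(self, max_native_batch=10, max_on_demand=10):
        # Defaults mirror the documented default pool sizes.
        self.max_native_batch = max_native_batch
        self.max_on_demand = max_on_demand
        self.native_running = []
        self.native_queue = deque()
        self.on_demand_running = []

    def submit_deployed_native(self, job):
        # Deployed jobs run if a slot is free; otherwise they wait in the queue.
        if len(self.native_running) < self.max_native_batch:
            self.native_running.append(job)
            return "running"
        self.native_queue.append(job)
        return "queued"

    def submit_on_demand(self, job):
        # On-demand jobs run immediately or are rejected; they are never queued.
        if len(self.on_demand_running) < self.max_on_demand:
            self.on_demand_running.append(job)
            return "running"
        return "rejected"

    def finish_deployed_native(self, job):
        # A finished job frees a slot, and the oldest queued job takes it.
        self.native_running.remove(job)
        if self.native_queue:
            self.native_running.append(self.native_queue.popleft())
```

In this model, with a native batch pool size of 2, a third deployed job is queued and starts only after an earlier job finishes. With an on-demand pool size of 1, a second concurrent preview request is rejected.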
When the Data Integration Service runs on a grid, each pool size applies to each running service process. The maximum number of deployed and on-demand jobs that can run concurrently across the grid is calculated as follows:
Maximum Hadoop batch pool size * Number of running service processes
Maximum native batch pool size * Number of running service processes
Maximum on-demand pool size * Number of running service processes
For example, a Data Integration Service grid includes three running service processes. If you set the Hadoop batch pool size to 10, each Data Integration Service process can run up to 10 deployed Hadoop jobs concurrently. A total of 30 deployed Hadoop jobs can run concurrently on the grid. If you try to run more than 30 Hadoop jobs, the Data Integration Service queues the jobs until there is space in the pool.
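The grid calculation above can be checked with a short script. This is a minimal sketch for capacity planning, not part of the product. The function names and dictionary keys are illustrative assumptions, and the default values are the ones listed for each property.

```python
def grid_capacity(pool_size: int, running_processes: int) -> int:
    """Return the grid-wide limit: pool size * number of running service processes."""
    if pool_size < 0 or running_processes < 0:
        raise ValueError("pool_size and running_processes must be non-negative")
    return pool_size * running_processes


# Documented default pool sizes; the key names are hypothetical labels.
DEFAULT_POOL_SIZES = {
    "hadoop_batch": 100,
    "native_batch": 10,
    "on_demand": 10,
}


def grid_capacities(pool_sizes: dict, running_processes: int) -> dict:
    """Apply the same multiplication to every pool."""
    return {
        name: grid_capacity(size, running_processes)
        for name, size in pool_sizes.items()
    }


if __name__ == "__main__":
    # The example from the text: pool size 10 on three running processes.
    print(grid_capacity(10, 3))
    print(grid_capacities(DEFAULT_POOL_SIZES, 3))
```

For the example in the text, `grid_capacity(10, 3)` evaluates to 30. With the default pool sizes and three running service processes, the same calculation gives 300 Hadoop batch jobs, 30 native batch jobs, and 30 on-demand jobs.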
When you increase the pool size values, the Data Integration Service uses more hardware resources such as CPU, memory, and system I/O. Set these values based on the resources available on the nodes in the grid. For example, consider the number of CPUs on the machines where Data Integration Service processes run and the amount of memory that is available to the Data Integration Service.
If the Data Integration Service grid runs jobs in separate remote processes, additional concurrent jobs might not run on compute nodes after you increase the value of these properties. You might need to override compute node attributes to increase the number of concurrent jobs on each compute node. For more information, see "Override Compute Node Attributes to Increase Concurrent Jobs."