A Data Integration Service that runs on a grid can run DTM instances in the Data Integration Service process, in separate DTM processes on the local node, or in separate DTM processes on remote nodes.
Configure a Data Integration Service grid based on the following types of jobs that the service runs:
SQL data services and web services
When a Data Integration Service grid runs SQL queries and web service requests, you can configure the service to run jobs in the Data Integration Service process. You can also configure SQL data service and web service jobs to run in separate DTM processes on the local node. All nodes in the grid must have both the service and compute roles. The Data Integration Service dispatches jobs to available nodes in a round-robin fashion.
SQL data service and web service jobs typically achieve better performance when the Data Integration Service runs jobs in the service process.
Mappings, profiles, and workflows that run in local mode
When a Data Integration Service grid runs mappings, profiles, and workflows, you can configure the service to run jobs in separate DTM processes on the local node. All nodes in the grid must have both the service and compute roles.
The Data Integration Service dispatches jobs to available nodes in a round-robin fashion.
When the Data Integration Service runs jobs in separate local processes, stability increases because an unexpected interruption to one job does not affect all other jobs.
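The round-robin dispatch described above can be pictured with a short simulation. This is only an illustrative sketch, not the Data Integration Service implementation: the function name `dispatch_round_robin`, the job names, and the node names are all hypothetical, and the sketch assumes every listed node is available.

```python
from itertools import cycle


def dispatch_round_robin(jobs, nodes):
    """Assign each job to the next node in turn and return {job: node}.

    Hypothetical model of round-robin dispatch; the real service also
    accounts for node availability, which this sketch ignores.
    """
    if not nodes:
        raise ValueError("the grid has no available nodes")
    node_cycle = cycle(nodes)  # node01, node02, node03, node01, ...
    return {job: next(node_cycle) for job in jobs}


assignments = dispatch_round_robin(
    ["job1", "job2", "job3", "job4", "job5"],
    ["node01", "node02", "node03"],
)
for job, node in assignments.items():
    print(job, "->", node)
# job4 wraps around to node01, and job5 goes to node02
```

Because every node in this mode has both the service and compute roles, any node in the list can receive any job, which is why a simple rotation is enough in the sketch.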
Mappings, profiles, and workflows that run in remote mode
When a Data Integration Service grid runs mappings, profiles, and workflows, you can configure the service to run jobs in separate DTM processes on remote nodes. The nodes in the grid can have different combinations of roles.
The Data Integration Service designates one node with the compute role as the master compute node. The Service Manager on the master compute node communicates with the Resource Manager Service to dispatch jobs to an available worker compute node. The Resource Manager Service matches job requirements with resource availability to identify the best compute node to run the job.
When the Data Integration Service runs jobs in separate remote processes, stability increases because an unexpected interruption to one job does not affect all other jobs. In addition, you can better use the resources available on each node in the grid. When a node has the compute role only, the node does not have to run the service process. The machine uses all available processing power to run mappings.
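The idea of matching job requirements with resource availability can be sketched as follows. The documentation does not state how the Resource Manager Service ranks nodes, so this sketch assumes a simple rule: keep only the nodes that satisfy the job's requirements, then prefer the one with the most spare capacity. The classes `ComputeNode` and `Job`, their fields, and the function `pick_compute_node` are hypothetical names used for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ComputeNode:
    name: str
    free_cores: int
    free_memory_mb: int


@dataclass
class Job:
    name: str
    cores: int
    memory_mb: int


def pick_compute_node(job: Job, nodes: List[ComputeNode]) -> Optional[ComputeNode]:
    """Return a node that can run the job, or None if no node fits.

    Assumed ranking: most cores left over after the job is placed,
    with leftover memory as the tie-breaker.
    """
    candidates = [
        n for n in nodes
        if n.free_cores >= job.cores and n.free_memory_mb >= job.memory_mb
    ]
    if not candidates:
        return None
    return max(
        candidates,
        key=lambda n: (n.free_cores - job.cores, n.free_memory_mb - job.memory_mb),
    )


workers = [
    ComputeNode("worker01", free_cores=2, free_memory_mb=4096),
    ComputeNode("worker02", free_cores=8, free_memory_mb=16384),
    ComputeNode("worker03", free_cores=8, free_memory_mb=2048),
]
chosen = pick_compute_node(Job("mapping_a", cores=4, memory_mb=4096), workers)
print(chosen.name if chosen else "no node can run the job")
```

In this example only worker02 has both enough cores and enough memory, so it is selected. A job that no node can satisfy returns `None`, which a caller could treat as a signal to queue the job.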
Ad hoc jobs, with the exception of profiles, can run in the Data Integration Service process or in separate DTM processes on the local node. Ad hoc jobs include mappings run from the Developer tool, and previews, scorecards, or drill downs on profile results run from the Developer tool or Analyst tool. If you configure a Data Integration Service grid to run jobs in separate remote processes, the service still runs ad hoc jobs in separate local processes.
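The rule for ad hoc jobs can be expressed as a small lookup. This is a sketch of one reading of the paragraph above, not product code: the `RunMode` enum and the `effective_run_mode` function are hypothetical, and the sketch assumes that ad hoc profiles follow the configured mode because they are excluded from the ad hoc rule.

```python
from enum import Enum


class RunMode(Enum):
    IN_SERVICE_PROCESS = "in the service process"
    SEPARATE_LOCAL = "in separate local processes"
    SEPARATE_REMOTE = "in separate remote processes"


def effective_run_mode(configured: RunMode, ad_hoc: bool, profile: bool = False) -> RunMode:
    """Return where a job runs, given the mode configured for the grid.

    Assumption: ad hoc jobs other than profiles never run remotely, so a
    remote configuration falls back to separate local processes for them.
    """
    if ad_hoc and not profile and configured is RunMode.SEPARATE_REMOTE:
        return RunMode.SEPARATE_LOCAL
    return configured


# A mapping previewed from the Developer tool on a grid configured for remote mode
preview_mode = effective_run_mode(RunMode.SEPARATE_REMOTE, ad_hoc=True)
print(preview_mode.value)  # in separate local processes
```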
By default, each Data Integration Service is configured to run jobs in separate local processes, and each node has both the service and compute roles.
If you run SQL queries or web service requests, and you also run other job types for which stability and scalability are important, create multiple Data Integration Services. Configure one Data Integration Service grid to run SQL queries and web service requests in the Data Integration Service process. Configure the other Data Integration Service grid to run mappings, profiles, and workflows in separate local processes or in separate remote processes.