Data Quality Performance Tuning Guide

Multithreading

You can run mappings in a multithreaded, or parallel, manner. The Execution Options on the Data Integration Service define the maximum number of parallel mappings that the service can run.
The following image shows the maximum parallelism option on the Data Integration Service:
You can view the Maximum Parallelism value as an Execution Options property on the Data Integration Service.
Parallelism also applies to the transformations within a mapping. Set the maximum permitted parallelism within a given mapping as a run-time property on the mapping.
The following image shows the maximum parallelism option on the mapping:
You can view the property on the Run-time tab of the mapping in the Developer tool.
All data quality transformations can be multithreaded except for the Exception, Association, Classifier, and Consolidation transformations. You can configure a Decision transformation to be partitionable.
The following graphs show the increase in throughput that you can achieve by enabling partitioning on a Standardizer and Parser transformation:
Similar performance increases are observed for other data quality transformations.
Consider the following rules and guidelines for parallelism:
  • The number of execution instances that you set on a Match transformation or Address Validator transformation must not exceed the maximum parallelism values that you set on the Data Integration Service or on the mapping that contains the transformation.
  • If you set the maximum parallelism value on a mapping to Auto, the mapping uses the maximum parallelism value on the Data Integration Service. This may result in diminished performance, depending on the number of mappings that run concurrently.
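
The two guidelines above can be expressed as a small check. The following Python sketch is illustrative only and is not part of any Informatica API: the function names, the `AUTO` constant, and the decision to test the execution instances against both the service value and the mapping value are assumptions made for the example.

```python
from typing import Union

# Hypothetical marker for a mapping whose maximum parallelism is set to Auto.
AUTO = "Auto"


def effective_mapping_parallelism(service_max: int,
                                  mapping_max: Union[int, str]) -> int:
    """Return the parallelism ceiling that applies to a mapping.

    A mapping set to Auto uses the maximum parallelism value
    on the Data Integration Service.
    """
    if mapping_max == AUTO:
        return service_max
    return int(mapping_max)


def check_execution_instances(instances: int,
                              service_max: int,
                              mapping_max: Union[int, str]) -> bool:
    """Return True when the execution instances on a Match or
    Address Validator transformation stay within both ceilings.

    Assumption: the sketch takes the conservative reading and
    compares against the service value and the mapping value.
    """
    mapping_limit = effective_mapping_parallelism(service_max, mapping_max)
    return instances <= service_max and instances <= mapping_limit


if __name__ == "__main__":
    # Service allows 4; mapping is Auto, so the mapping ceiling is also 4.
    print(check_execution_instances(2, 4, AUTO))
    # Mapping is capped at 2, so 3 execution instances exceed the ceiling.
    print(check_execution_instances(3, 4, 2))
```

A check of this kind is useful when you review mappings before deployment, because a mapping set to Auto changes its ceiling whenever the service value changes.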
