Enhanced batching

The previous release introduced message queue communication for data quality (DQ) rule executions and for business process management (BPM) intercommunication, to make the system more robust in high-load scenarios. This release extends that work with batching for DQ and merge processes, which further improves performance, especially in mass-data and high-load operation scenarios.

Data quality batching

Until the current release, data quality message queue requests were processed one at a time. To improve performance and resource utilization, these requests are now batched automatically within the same message queue. This improves performance by up to a factor of 5, especially in scenarios where many single DQ requests are sent in a very short period of time, for example an item change event that is executed on thousands of single objects.
  • Without batching, a sample set of 20k item records takes approximately 25 minutes to execute a rich set of data quality rules when each item is triggered for execution individually (e.g. on mass item change events).
  • With batching, the same set of 20k item records, executing the same data quality rules, completed within approximately 5 minutes in our benchmark tests.
Results from the test scenario above:

| Operation | Prior version, no batching | 10.1 with batching | Improvement |
| --- | --- | --- | --- |
| Sample data set, 20k item records | 25 minutes | 5 minutes | 5x |
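
Expressed as throughput, the benchmark figures above work out as follows. This is plain arithmetic on the reported, approximate numbers and not an additional measurement; the function and variable names are illustrative only.

```python
# Benchmark figures as reported above (approximate values).
ITEMS = 20_000
MINUTES_WITHOUT_BATCHING = 25
MINUTES_WITH_BATCHING = 5


def items_per_second(items: int, minutes: float) -> float:
    """Average throughput in items per second for a run of the given length."""
    return items / (minutes * 60)


before = items_per_second(ITEMS, MINUTES_WITHOUT_BATCHING)
after = items_per_second(ITEMS, MINUTES_WITH_BATCHING)
speedup = MINUTES_WITHOUT_BATCHING / MINUTES_WITH_BATCHING

print(f"without batching: {before:.1f} items/s")  # 13.3 items/s
print(f"with batching:    {after:.1f} items/s")   # 66.7 items/s
print(f"speedup:          {speedup:.0f}x")        # 5x
```
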
There is no change to the structure of the data quality request and response messages. The batch threshold can be defined in plugin_customization.ini:

    # ---------------------------------------------------------------------------
    # Message Batch preferences
    # ---------------------------------------------------------------------------
    # Specifies the size(number of items) of data quality message batch
    # Default: 500
    #dataquality.message.batch.threshold=500
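
To illustrate what a batch threshold of this kind means, here is a minimal Python sketch. It assumes the threshold acts as an upper bound on the number of items per batch; how Product 360 forms batches internally is not described in these notes, and `split_into_batches` is a hypothetical helper, not part of the product.

```python
from typing import Iterator, List, Sequence

# Default value taken from the plugin_customization.ini comment above.
DEFAULT_BATCH_THRESHOLD = 500


def split_into_batches(
    item_ids: Sequence[int], threshold: int = DEFAULT_BATCH_THRESHOLD
) -> Iterator[List[int]]:
    """Yield consecutive batches holding at most `threshold` item ids each.

    Assumption: the threshold is a simple upper bound on the batch size.
    """
    if threshold < 1:
        raise ValueError("threshold must be at least 1")
    for start in range(0, len(item_ids), threshold):
        yield list(item_ids[start:start + threshold])


# Under this assumption, 20,000 single requests become 40 batches of 500.
batches = list(split_into_batches(range(20_000)))
print(len(batches), len(batches[0]), len(batches[-1]))  # 40 500 500
```

In this model a smaller threshold produces more, smaller batches, and a larger one produces fewer, larger batches; which value works best depends on the rule set and system load.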

Merge batching

Similar to DQ batching, Product 360 10.1 enables batching of merge processes. This increases performance especially in scenarios where many single merge requests are sent within a short period of time (e.g. in the last step of an enrichment workflow that builds the golden record in the master catalog).
Merge requests written to the Service API message queue, or sent as direct REST calls, are still processed as single requests. Only merge requests that arrive via the batching queue are considered for batching.
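
The routing rule can be pictured as follows. This is a sketch for illustration only: the `Source` enum, the `MergeRequest` class, and the `plan_processing` function are hypothetical names and do not reflect the product's API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple


class Source(Enum):
    """Where a merge request came from (hypothetical labels)."""
    SERVICE_API_QUEUE = "service-api-queue"
    REST = "rest"
    BATCHING_QUEUE = "batching-queue"


@dataclass(frozen=True)
class MergeRequest:
    item_id: int
    source: Source


def plan_processing(
    requests: List[MergeRequest],
) -> Tuple[List[MergeRequest], List[MergeRequest]]:
    """Split requests into (processed singly, eligible for batching)."""
    single = [r for r in requests if r.source is not Source.BATCHING_QUEUE]
    batched = [r for r in requests if r.source is Source.BATCHING_QUEUE]
    return single, batched


requests = [
    MergeRequest(1, Source.REST),
    MergeRequest(2, Source.BATCHING_QUEUE),
    MergeRequest(3, Source.SERVICE_API_QUEUE),
    MergeRequest(4, Source.BATCHING_QUEUE),
]
single, batched = plan_processing(requests)
print([r.item_id for r in single], [r.item_id for r in batched])  # [1, 3] [2, 4]
```
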

Finish (approve) trigger batching

After the batching of DQ and merge processes, the next logical step in this improvement initiative was the batching of finish events for workflow tasks. This makes the application much more responsive when users finish or approve many objects within a workflow task in one go. Instead of sending, for example, 1,000 individual finish messages, only one finish message containing all 1,000 objects is sent. This saves resources especially on the BPM server, but also within the message queue itself.
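
The saving in message count can be sketched like this. The message layout (a dictionary with `task_id` and `object_ids`) and the task name are assumptions made for illustration; the actual finish message format is not shown in these notes.

```python
from typing import Dict, List


def individual_finish_messages(task_id: str, object_ids: List[int]) -> List[Dict]:
    """Previous behaviour: one finish message per object."""
    return [{"task_id": task_id, "object_ids": [oid]} for oid in object_ids]


def batched_finish_message(task_id: str, object_ids: List[int]) -> Dict:
    """Batched behaviour: a single finish message carrying all objects."""
    return {"task_id": task_id, "object_ids": list(object_ids)}


object_ids = list(range(1, 1001))  # 1,000 objects finished in one go
before = individual_finish_messages("approve-items", object_ids)
after = [batched_finish_message("approve-items", object_ids)]

# Number of messages before, number after, objects carried by the one message.
print(len(before), len(after), len(after[0]["object_ids"]))  # 1000 1 1000
```
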
