Consider the following best practices for mappings that contain a Data Processor transformation:
For mappings that run on the Blaze engine, you must enable partitioning for Data Processor transformations. To enable partitioning, perform the following steps:
In the Developer tool, open the mapping and select the Data Processor transformation.
In the Properties view, click the Advanced tab.
Select Enable partitioning for Data Processor transformations.
The following image shows the Advanced tab of a Data Processor transformation:
When a mapping with a Data Processor transformation meets all of the following conditions, the Blaze engine processes the entire mapping in a single tasklet:
The mapping source file is of a non-splittable input format.
The transformation contains multiple output groups.
The Data Processor transformation might output a higher data volume than the source. For such scenarios, configure the Blaze engine to first stage the data generated by the transformation at each output group.
The following image shows a mapping with a Data Processor transformation with multiple output groups:
To stage data at every output group, set the following mapping run-time property in the Developer tool:
Parameter: Blaze.StageOutputGroupDataForInstances
Value: The name of the Data Processor transformation instance.
When the Blaze engine is configured to first stage the data, it performs the following tasks:
Re-partitions the data.
Processes the staged data.
Creates the correct number of tasklets based on the staged data volume.
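The reasoning behind staging can be illustrated with a small Python sketch. It is a conceptual model only, not Blaze engine code: the function names and the bytes_per_tasklet parameter are hypothetical, and the source does not describe how the Blaze engine actually sizes tasklets. The sketch assumes one tasklet per fixed-size chunk of staged data.

```python
def runs_in_single_tasklet(source_is_splittable: bool, output_group_count: int) -> bool:
    """Return True when both conditions described above hold:
    a non-splittable source and more than one output group."""
    return (not source_is_splittable) and output_group_count > 1


def estimated_tasklets(staged_bytes: int, bytes_per_tasklet: int) -> int:
    """Hypothetical sizing rule: one tasklet per chunk of staged data,
    with a minimum of one tasklet. The real engine may size differently."""
    if bytes_per_tasklet <= 0:
        raise ValueError("bytes_per_tasklet must be positive")
    # Integer ceiling division avoids floating-point rounding.
    chunks = (staged_bytes + bytes_per_tasklet - 1) // bytes_per_tasklet
    return max(1, chunks)


if __name__ == "__main__":
    # A non-splittable source with three output groups meets both conditions.
    print(runs_in_single_tasklet(source_is_splittable=False, output_group_count=3))
    # After staging, the tasklet count follows the staged volume, not the source.
    print(estimated_tasklets(staged_bytes=1000, bytes_per_tasklet=128))
```

Under these assumptions, a transformation that expands a small non-splittable source into a large output is no longer limited to one tasklet, because the tasklet count is derived from the staged volume.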