The following process flow summarizes the steps that you take to configure a Match transformation for identity match analysis. You can define a process that uses the Match transformation alone or that uses the Match transformation and other transformations.
Before you connect the Match transformation to upstream data objects, verify that the records contain unique sequence identifier values. You can use a Key Generator transformation to create the values. When you perform identity match analysis, you can optionally organize the input data into groups.
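The uniqueness check described above can be illustrated outside the tool. The following Python sketch is not part of the Match transformation or the Key Generator transformation. It assumes records held as dictionaries, and the field name SequenceId and both helper functions are hypothetical names chosen for the example.

```python
from collections import Counter


def assign_sequence_ids(records, start=1):
    """Return copies of the records with an integer SequenceId added.

    SequenceId is a hypothetical field name used only in this sketch.
    """
    return [{**rec, "SequenceId": i} for i, rec in enumerate(records, start=start)]


def duplicate_ids(records, key="SequenceId"):
    """Return the identifier values that occur on more than one record."""
    counts = Counter(rec[key] for rec in records)
    return sorted(value for value, n in counts.items() if n > 1)


records = [{"name": "Ann Lee"}, {"name": "Anne Lee"}, {"name": "Bob Ray"}]
keyed = assign_sequence_ids(records)

# An empty list means every record carries a distinct identifier.
print(duplicate_ids(keyed))
```

If the check returns any values, correct the identifiers upstream before the data reaches the Match transformation.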
Perform the following steps in the Match transformation:
Specify identity analysis as the match type, and specify the number of data sources.
If you configure the transformation to analyze two data sets, select a master data set.
Use the Match Type view to set the type and the number of data sources.
Identify the location to store the index data. The transformation can write the index data to temporary files or save the index data to database tables.
Use the Match Type view to specify the index data store.
Define a match analysis strategy. Select a population and a comparison algorithm, and assign a pair of columns to the algorithm.
The population indicates the column pairs to select.
Use the Strategies view to define the strategy.
Specify the method that the transformation uses to generate the match analysis results.
Set the match threshold value. The match threshold is the minimum score that can identify two records as duplicates of one another.
Use the Match Output view to select the output method and the match threshold.
You can set the match threshold in a Match transformation or a Weighted Average transformation. Use the Weighted Average transformation if you create a match mapplet.
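To make the role of the match threshold concrete, the following Python sketch combines several per-strategy scores into one score with a weighted average and then applies a threshold. It is an illustration under stated assumptions, not the internal logic of the Match transformation or the Weighted Average transformation: the 0 to 1 score scale, the inclusive comparison (taken from the phrase "minimum score"), the weights, and the function names are all hypothetical.

```python
def weighted_average(scores, weights):
    """Combine per-strategy scores into a single score.

    Assumes one weight per score and a positive total weight.
    """
    if not scores or len(scores) != len(weights):
        raise ValueError("scores and weights must be non-empty and equal in length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("total weight must be positive")
    return sum(s * w for s, w in zip(scores, weights)) / total


def is_match(score, threshold):
    """Treat the threshold as a minimum: a score equal to it counts as a match."""
    return score >= threshold


# Hypothetical scores for two record pairs, keyed by sequence identifiers.
# Each pair has one score per strategy.
pair_scores = {(1, 2): [0.95, 0.80], (1, 3): [0.40, 0.60]}
weights = [3, 1]    # assumed relative importance of the two strategies
threshold = 0.85    # assumed match threshold

combined = {pair: weighted_average(s, weights) for pair, s in pair_scores.items()}
matches = [pair for pair, score in combined.items() if is_match(score, threshold)]
print(matches)
```

In this sketch, raising the threshold admits fewer pairs as duplicates and lowering it admits more, which is the trade-off to consider when you choose the value.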