Developer Transformation Guide

10.4.0

Match Pairs and Clusters

The number of rows that the Match transformation writes can differ from the number of rows that it reads, and the transformation can change the sequence of the output rows. You determine the output format for the results of the match analysis.

The transformation can write rows in the following formats:

Matched pairs: The transformation writes a row for every pair of records that match with a score that meets the match threshold. The transformation writes each pair of records to a single row. Because a record might match more than one other record, a record might appear on more than one output row.

Best match: The transformation writes a row for each record in a data set and adds the most similar record from another data set to the same row.

Clusters: The transformation assigns the output records to clusters based on the levels of similarity between the records. A cluster is a set of records in which each record matches at least one other record with a score that meets the match threshold. The transformation writes each record to a single row. Each record in a cluster must match at least one other record in the cluster. Therefore, a cluster can contain pairs of records that do not match each other. A cluster can contain a single record if the record does not match any other record. The Clusters option in field analysis corresponds to the Clusters - Match All option in identity analysis. The Clusters - Best Match option in identity analysis combines cluster calculations and matched pair calculations.
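
The three output formats can be illustrated with a minimal Python sketch. It is not the Match transformation's implementation. The record names, the scores in SCORES, the THRESHOLD value, and the function names are all hypothetical, and the sketch assumes that a match strategy has already calculated the pairwise scores. Clusters are modeled as groups of records connected by qualifying matches, which follows from the description above, and the best match function applies no threshold, which is also an assumption.

```python
from itertools import combinations

# Hypothetical pairwise match scores. In practice a match strategy calculates these.
SCORES = {
    ("A", "B"): 0.92,
    ("B", "C"): 0.88,
    ("A", "C"): 0.60,
    ("D", "E"): 0.95,
}
RECORDS = ["A", "B", "C", "D", "E", "F"]
THRESHOLD = 0.85  # hypothetical match threshold


def score(x, y):
    """Return the symmetric score for a pair. Unlisted pairs score 0.0."""
    return SCORES.get((x, y), SCORES.get((y, x), 0.0))


def matched_pairs(records, threshold):
    """Return one row per pair of records whose score meets the threshold."""
    return [
        (x, y, score(x, y))
        for x, y in combinations(records, 2)
        if score(x, y) >= threshold
    ]


def best_match(left, right):
    """Return one row per record in left, with its most similar record in right."""
    return [(x, max(right, key=lambda y: score(x, y))) for x in left]


def clusters(records, threshold):
    """Group records that are connected by at least one qualifying match."""
    parent = {r: r for r in records}

    def find(r):
        # Follow parent links to the cluster root, shortening the path on the way.
        while parent[r] != r:
            parent[r] = parent[parent[r]]
            r = parent[r]
        return r

    # Merge the clusters of every pair that meets the threshold.
    for x, y, _ in matched_pairs(records, threshold):
        parent[find(x)] = find(y)

    groups = {}
    for r in records:
        groups.setdefault(find(r), []).append(r)
    return sorted(groups.values())


print(matched_pairs(RECORDS, THRESHOLD))
print(best_match(["A", "D"], ["B", "C", "E"]))
print(clusters(RECORDS, THRESHOLD))
```

With this sample data, record B appears in two matched pair rows. Records A and C share a cluster although their own score of 0.60 is below the threshold, because each of them matches B. Record F matches no other record and forms a single-record cluster.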