Developer Transformation Guide

Match Pairs and Clusters

The Match transformation can write a different number of output rows than it reads as input rows, and it can change the sequence of the output rows. You determine the output format for the results of the match analysis.
The transformation can write rows in the following formats:
Matched pairs
The transformation writes a row for every pair of records that match with a score that meets the match threshold. The transformation writes each pair of records to a single row.
Because a record might match more than one other record, a record might appear on more than one output row.
Best match
The transformation writes a row for each record in a data set and adds the most similar record from another data set to the same row.
Clusters
The transformation assigns the output records to clusters based on the levels of similarity between the records. A cluster is a set of records in which each record matches at least one other record with a score that meets the match threshold. The transformation writes each record to a single row.
Because a record needs to match only one other record in the cluster, a cluster can contain pairs of records that do not match each other directly. A record that does not match any other record forms a cluster that contains a single record.
The Clusters option in field analysis corresponds to the Clusters - Match All option in identity analysis. The Clusters - Best Match option in identity analysis combines cluster calculations and matched pair calculations.
Configure the output options on the Match Output view of the transformation.
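
The three output formats can be illustrated with a minimal Python sketch. It is not the Match transformation's implementation. The function names (`jaccard`, `matched_pairs`, `clusters`, `best_match`), the sample records, and the token-overlap score are assumptions made for illustration, and the score is only a stand-in for a configured match strategy. The sketch treats a cluster as a connected group of records in which every record is linked to at least one other record by a qualifying pair.

```python
from itertools import combinations


def jaccard(a, b):
    """Stand-in similarity score (assumption): shared words / all words."""
    ta, tb = set(a.split()), set(b.split())
    return len(ta & tb) / len(ta | tb)


def matched_pairs(records, score, threshold):
    """One output row per pair of records whose score meets the threshold."""
    rows = []
    for (id_a, rec_a), (id_b, rec_b) in combinations(records.items(), 2):
        s = score(rec_a, rec_b)
        if s >= threshold:
            rows.append((id_a, id_b, s))
    return rows


def clusters(record_ids, pairs):
    """One cluster per connected group of matched records (union-find)."""
    parent = {rid: rid for rid in record_ids}

    def find(x):
        # Follow parent links to the cluster representative.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for id_a, id_b, _ in pairs:
        parent[find(id_a)] = find(id_b)

    groups = {}
    for rid in record_ids:
        groups.setdefault(find(rid), []).append(rid)
    return sorted(sorted(group) for group in groups.values())


def best_match(left, right, score):
    """One output row per record in `left`, with its most similar `right` record."""
    result = {}
    for id_l, rec_l in left.items():
        id_r = max(right, key=lambda rid: score(rec_l, right[rid]))
        result[id_l] = (id_r, score(rec_l, right[id_r]))
    return result


# Hypothetical sample data.
records = {
    1: "acme steel tools",
    2: "steel tools ltd",
    3: "tools ltd holdings",
    4: "zenith bakery",
}
reference = {10: "steel tools", 11: "zenith bakery shop"}

pairs = matched_pairs(records, jaccard, threshold=0.5)
# pairs -> [(1, 2, 0.5), (2, 3, 0.5)]; record 2 appears on two rows

groups = clusters(records.keys(), pairs)
# groups -> [[1, 2, 3], [4]]; records 1 and 3 share a cluster without matching

best = best_match(records, reference, jaccard)
```

In the sample data, record 2 appears on two matched-pair rows, records 1 and 3 fall into the same cluster although they score below the threshold against each other, and record 4 forms a cluster of one.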
