The Match transformation is an active transformation that analyzes the levels of similarity between records. Use the Match transformation to find records that contain duplicate information in a data set or between two data sets.
The Match transformation analyzes the values on an input port and generates a set of numeric scores that represent the degrees of similarity between the values. You can select multiple ports to determine the overall levels of similarity between the input records. You specify a minimum score as a threshold value to identify the records that are likely to contain duplicate information.
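The scoring and threshold idea described above can be illustrated with a minimal Python sketch. It is not the Match transformation's own algorithm: it assumes a generic string-similarity measure from the Python standard library (`difflib.SequenceMatcher`), a simple average to combine the per-field scores, and hypothetical names (`field_score`, `record_score`, `find_likely_duplicates`) and sample records chosen only for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def field_score(a, b):
    """Return a similarity score between 0 and 1 for two field values."""
    # Assumption: case and surrounding whitespace are not significant.
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def record_score(rec_a, rec_b, fields):
    """Combine per-field scores into one record-level score (simple average)."""
    return sum(field_score(rec_a[f], rec_b[f]) for f in fields) / len(fields)


def find_likely_duplicates(records, fields, threshold):
    """Return (index_a, index_b, score) for each pair scoring at or above the threshold."""
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(records), 2):
        score = record_score(a, b, fields)
        if score >= threshold:
            pairs.append((i, j, score))
    return pairs


# Hypothetical sample data: the first two records describe the same customer.
records = [
    {"name": "John Smith", "city": "Boston"},
    {"name": "Jon Smith", "city": "Boston"},
    {"name": "Maria Garcia", "city": "Denver"},
]

# Compare on two fields and flag pairs that meet the minimum score.
matches = find_likely_duplicates(records, ["name", "city"], threshold=0.85)
```

In this sketch, raising the threshold returns fewer and closer matches, while lowering it returns more candidate pairs that may need manual review.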
You can use the Match transformation in the following data projects:
Customer relationship management. For example, a store designs a mail campaign and must check the customer database for duplicate customer records.
Mergers and acquisitions. For example, a bank buys another bank in the same region, and the two banks have customers in common.
Regulatory compliance initiatives. For example, a business operates under government or industry regulations that require all data systems to be free of duplicate records.
Financial risk management. For example, a bank may want to search for relationships between account holders.
Master data management. For example, a retail chain has a master database of customer records, and each retail store in the chain submits records to the master database on a regular basis.
Any project that must identify duplicate records in a data set.