The Deduplicate transformation adds a deduplicate asset that you created in Data Quality to a mapping.
Use a Deduplicate transformation to analyze the levels of duplication in a data set and optionally to consolidate sets of duplicate records into a single, preferred record. Deduplicate transformations analyze the identity information in the records. An identity is a group of data values in a record that identify a person or an organization.
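The idea of grouping records by identity and then consolidating each group can be illustrated with a minimal Python sketch. It is not the transformation's own matching logic, which this section does not describe. The field names (`name`, `postcode`, `email`), the exact-match comparison on normalized values, and the "most complete record wins" rule are all assumptions made for illustration.

```python
from collections import defaultdict


def identity_key(record):
    """Build a normalized key from the fields assumed to form the identity.

    Assumption: name plus postcode identifies a person. A real identity
    match would typically tolerate spelling variations as well.
    """
    name = " ".join(record.get("name", "").lower().split())
    postcode = record.get("postcode", "").replace(" ", "").upper()
    return (name, postcode)


def find_duplicate_groups(records):
    """Group records that share the same normalized identity key."""
    groups = defaultdict(list)
    for record in records:
        groups[identity_key(record)].append(record)
    return list(groups.values())


def consolidate(group):
    """Pick a preferred record: here, the one with the most non-empty fields."""
    return max(group, key=lambda r: sum(1 for v in r.values() if v))


# Hypothetical sample data: the first two records describe the same person.
records = [
    {"name": "Ann  Lee", "postcode": "ab1 2cd", "email": ""},
    {"name": "ann lee", "postcode": "AB12CD", "email": "ann@example.com"},
    {"name": "Bob Ray", "postcode": "XY9 8ZW", "email": "bob@example.com"},
]

groups = find_duplicate_groups(records)
preferred = [consolidate(group) for group in groups]
```

In this sketch, the size of each group indicates the level of duplication, and `preferred` holds one consolidated record per identity.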
Deduplication and consolidation are useful operations in the following types of data projects:
Customer Relationship Management. For example, a store designs a mail campaign and must check the customer database for duplicate customer records.
Regulatory compliance initiatives. For example, a business operates under government or industry regulations that require all data systems to be free of duplicate records.
Financial risk management. For example, a bank may want to search for relationships between account holders.
Any project that must identify or eliminate records that store duplicate identity information.