Developer Transformation Guide

Persistent Index Case Study

You are a data steward at a retail bank that has multiple branches. You manage a master set of the customer account records from all of the branches. You use a set of index database tables to verify that the customer account database does not contain redundant or duplicate records.
To create and manage the index data store, you perform the following operations:
  • You create the data store.
  • You update the data store with the most recent data from the bank branches.
    You might add account data to the data store, or you might update the current data in the data store.
  • You remove obsolete records from the data store.
You understand that each operation might create duplicate records in the data store. You decide to develop a policy to analyze the branch data before you add the data to the master data store. You use identity match analysis to analyze the branch data and to verify that the data does not create duplicate identities in the data store. You configure the persistent index options on the Match transformation to analyze the branch data and the data store.

Develop a Policy for Persistent Index Data Management

As a data steward, you define a business rule that states that the customer account data store cannot contain duplicate identities. You design an identity match mapping to analyze the branch data in a staging database before you add the data to the data store.
The operations to add the branch data to the data store can create duplicate identities in the following cases:
  • The branch data contains duplicate identities.
  • The branch data contains an identity that the index also contains.
  • The branch data contains a newer version of an identity in the data store, and the newer version matches another identity in the index.
When you compare the staging database to the data store, select the persistent index options that reflect the duplicate record status of the branch data. Before you update the data store, you might decide to compare the branch data with the index data.
You can enable and disable match analysis on some of the options. Enable match analysis to analyze the mapping data or to compare the index data store to the mapping data. Disable match analysis when you do not need to compare the data. You can also use the Match properties on the Match Output tab to include or exclude data from match analysis.
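The three duplicate cases listed above can be sketched in a few lines of Python. This is a conceptual model only, not Informatica code: the function names are hypothetical, each data set is assumed to be a dictionary that maps a sequence identifier to a customer name, and a simple case-insensitive and whitespace-insensitive comparison stands in for identity match analysis.

```python
def identity_key(name: str) -> str:
    # Assumption: exact equality after normalization stands in for identity matching.
    return " ".join(name.lower().split())


def duplicate_risks(branch: dict, store: dict) -> dict:
    """Classify branch rows by the duplicate case that they would trigger."""
    risks = {"within_branch": [], "already_in_store": [], "update_collides": []}
    seen = {}  # identity key -> first branch ID that carried the key
    for seq_id, name in branch.items():
        key = identity_key(name)
        # Case 1: the branch data contains duplicate identities.
        if key in seen:
            risks["within_branch"].append((seen[key], seq_id))
        else:
            seen[key] = seq_id
        for store_id, store_name in store.items():
            if store_id == seq_id:
                continue  # same sequence ID: the store row is the record being updated
            if identity_key(store_name) == key:
                if seq_id in store:
                    # Case 3: a newer version of a store row matches another store row.
                    risks["update_collides"].append((seq_id, store_id))
                else:
                    # Case 2: a new branch identity already exists in the store.
                    risks["already_in_store"].append((seq_id, store_id))
    return risks


branch = {1: "Ann Lee", 2: "ann  lee", 3: "Bob Ray", 10: "Cy Dunn"}
store = {10: "Cyrus Dunn", 11: "Cy Dunn", 12: "BOB RAY"}
risks = duplicate_risks(branch, store)
```

In this example, each list in `risks` receives one pair, one for each of the three cases.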

Compare a Mapping Data Source with the Index Data Store

To compare the mapping input data with the index data store and to make no change to the data store, select the following option:
  • Do not update the database
The mapping compares the input data to the index data store. The mapping does not add, remove, or update any data in the index data store.
You cannot disable identity match analysis when you select this option.
Because you do not update the index data, you cannot create duplicate rows in the store. Select the option from the Match properties on the Match Output tab that meets the current needs of the data project. For example, select the Full option. The Full option verifies that the mapping data does not contain duplicates and verifies that the mapping data does not add duplicates to the data store.
Use the Do not update the database option to compare the mapping data and the data store before you update the data store. If the mapping output indicates that the mapping data does not add duplicates to the data store, run the mapping again with an option that updates the database.
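The compare-first workflow can be illustrated with a small Python model. The sketch is not Informatica code. It assumes that each data set is a dictionary that maps a sequence identifier to a name, and the hypothetical `normalize` function stands in for identity match analysis.

```python
def normalize(name: str) -> str:
    # Assumption: ignore case and extra spaces in place of real identity matching.
    return " ".join(name.lower().split())


def compare_only(mapping_rows: dict, store: dict) -> list:
    """Return (mapping ID, store ID) pairs that match. The store is never modified."""
    by_key = {}
    for store_id, name in store.items():
        by_key.setdefault(normalize(name), []).append(store_id)
    matches = []
    for row_id, name in mapping_rows.items():
        for store_id in by_key.get(normalize(name), []):
            if store_id != row_id:  # a row does not count as a duplicate of itself
                matches.append((row_id, store_id))
    return matches


store = {1: "Ann Lee", 2: "Bob Ray"}
staging = {3: "ANN  LEE", 4: "Dee Fox"}

matches = compare_only(staging, store)
# Run the mapping again with an update option only when no matches are reported.
safe_to_update = not matches
```

Here the staging row with ID 3 matches the store row with ID 1, so `safe_to_update` is `False` and the store is left untouched.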

Create the Data Store and Add Rows to the Data Store

To create a data store or to add rows from the mapping data to a data store, select the following option:
  • Update the database with new IDs
The mapping adds a row to the data store if the row does not share a sequence identifier with a row in the data store. The mapping does not overwrite any row in the index tables. When you specify empty database tables, the mapping writes all of the mapping index data to the tables.
You can enable or disable identity match analysis when you select the option. The option enables match analysis by default.
Because you do not update the index rows, select the Exclusive option or the Partial option from the Match properties on the Match Output tab. Use the Exclusive option if you verified the uniqueness of the mapping data rows in an earlier process.
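The insert-only behavior of this option can be modeled in Python as follows. This is a conceptual sketch, not Informatica code: `add_new_ids` is a hypothetical name, and the data store is assumed to be a dictionary keyed by sequence identifier.

```python
def add_new_ids(mapping_rows: dict, store: dict) -> list:
    """Add a row only if its sequence ID is not yet in the store. Never overwrite."""
    added = []
    for seq_id, record in mapping_rows.items():
        if seq_id not in store:
            store[seq_id] = record
            added.append(seq_id)
    return added


store = {}
# Empty store: every mapping row is written.
first_run = add_new_ids({1: "Ann Lee", 2: "Bob Ray"}, store)
# ID 2 already exists, so the store keeps the original row for ID 2.
second_run = add_new_ids({2: "Robert Ray", 3: "Dee Fox"}, store)
```

After the second run, the store contains IDs 1, 2, and 3, and the row for ID 2 still reads "Bob Ray".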

Update the Rows in the Data Store

To update a current row in the data store with the mapping data, select the following option:
  • Update the current IDs in the database
The mapping updates a current record in the data store if the record shares a sequence identifier with a record in the mapping data. The mapping does not add any row to the index tables.
You can enable or disable identity match analysis when you select the option. The option disables match analysis by default.
Because you do not add index rows to the index tables, select the Full option from the Match properties on the Match Output tab.
When you update the rows in the data store, you expect to find duplicates between the mapping source data and the data store. Select the Full option to verify that the identity data that you add to the store does not match the current data in the store.
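The update-only behavior can be modeled in the same way. The sketch below is conceptual and is not Informatica code: `update_current_ids` is a hypothetical name, and the data store is assumed to be a dictionary keyed by sequence identifier.

```python
def update_current_ids(mapping_rows: dict, store: dict) -> list:
    """Overwrite a store row only if the mapping data shares its sequence ID. Never add."""
    updated = []
    for seq_id, record in mapping_rows.items():
        if seq_id in store:
            store[seq_id] = record
            updated.append(seq_id)
    return updated


store = {1: "Ann Lee", 2: "Bob Ray"}
# ID 2 exists and is updated. ID 9 is not in the store, so it is ignored.
updated = update_current_ids({2: "Robert Ray", 9: "Dee Fox"}, store)
```

Because an updated row can now match a different row in the store, a full comparison after the update remains useful.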

Remove Rows from the Data Store

To remove rows from the data store, select the following option:
  • Remove IDs from the database
The mapping deletes a row from the data store if the row shares a sequence identifier with a record in the mapping data.
You can enable or disable identity match analysis when you select the option. The option disables match analysis by default.
When you remove data from a data store, you change the relationships between the rows in the store. If the store contains duplicate identities, you might remove data for a driver record or a linked record in a cluster. Or, you might remove data for the best match in a matched pair. When you run the mapping again, the mapping might generate different clusters or duplicate pairs. If you remove rows from a data store that does not contain duplicate records, you cannot change the duplicate status of the records. When you run the mapping after you delete the rows, the mapping generates the same match scores for the identities that remain in the data set.
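The effect of a removal on clusters can be illustrated with a small Python model. This is a sketch under stated assumptions, not Informatica code: the function names are hypothetical, the store is a dictionary keyed by sequence identifier, and a cluster is any group of two or more rows whose names are equal after normalization.

```python
from collections import defaultdict


def normalize(name: str) -> str:
    # Assumption: ignore case and extra spaces in place of real identity matching.
    return " ".join(name.lower().split())


def clusters(store: dict) -> list:
    """Group sequence IDs that share a normalized name. Keep groups of two or more."""
    groups = defaultdict(list)
    for seq_id, name in sorted(store.items()):
        groups[normalize(name)].append(seq_id)
    return [ids for ids in groups.values() if len(ids) > 1]


def remove_ids(mapping_rows: dict, store: dict) -> None:
    """Delete each store row that shares a sequence ID with a mapping row."""
    for seq_id in mapping_rows:
        store.pop(seq_id, None)


# A store with duplicates: removing one member changes the cluster.
dup_store = {1: "Ann Lee", 2: "ann lee", 3: "ANN LEE", 4: "Bob Ray"}
before = clusters(dup_store)
remove_ids({1: "Ann Lee"}, dup_store)
after = clusters(dup_store)

# A store without duplicates: a removal leaves the duplicate status unchanged.
unique_store = {1: "Ann Lee", 2: "Bob Ray", 3: "Dee Fox"}
remove_ids({3: "Dee Fox"}, unique_store)
unique_after = clusters(unique_store)
```

In the first store, the cluster shrinks from IDs 1, 2, and 3 to IDs 2 and 3. In the second store, no clusters exist before or after the removal.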
