Substitution Masking

Substitution masking replaces a column of data with similar but unrelated data from a dictionary. Mask date, numeric, and string data types with substitution masking.
Use substitution masking to mask string data with realistic output. For example, if you want to mask address data, you specify a dictionary file that contains addresses. If you want to mask a Social Security number, you can specify the InvalidSSN dictionary file that contains Social Security numbers that are not valid.
Substitution is an effective way to replace production data with realistic test data. When you configure substitution masking, select the relational or flat file dictionary that contains the substitute values. The PowerCenter Integration Service performs a lookup on the dictionary and replaces source data with data from the dictionary. You can use a relational dictionary to mask Hadoop data.
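The dictionary lookup described above can be pictured with a minimal Python sketch. It is an illustration only, not how the PowerCenter Integration Service is implemented: the in-memory list `ADDRESS_DICTIONARY` and the function `substitute` are hypothetical stand-ins for a relational or flat file dictionary and for the lookup step.

```python
import random

# Hypothetical dictionary of substitute values. In the product this would be
# a relational or flat file dictionary, not a Python list.
ADDRESS_DICTIONARY = [
    "12 Elm Street",
    "48 Harbor Road",
    "305 Pine Avenue",
    "7 Maple Court",
]


def substitute(values, dictionary, rng=None):
    """Replace each source value with an entry chosen from the dictionary.

    The choice ignores the source value, so the output is realistic but
    unrelated to the input (non-repeatable substitution).
    """
    rng = rng or random.Random()
    return [rng.choice(dictionary) for _ in values]


source_addresses = ["221B Baker Street", "10 Downing Street"]
masked_addresses = substitute(source_addresses, ADDRESS_DICTIONARY)
```

Each masked value comes from the dictionary, so the output keeps the shape of an address while revealing nothing about the original rows.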
When you assign a substitution masking rule to a column, you can specify the rule assignment parameters.
The following table describes the rule assignment parameters that you can configure:
Parameter                     Description
Lookup Condition              The name of the source table column to match against a column in the dictionary. This field is optional.
Unique Substitution Column    The name of the source table column to substitute with unique data. This field is optional.
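One way to read the two parameters is sketched below in Python. The sketch rests on assumptions that the text above does not confirm: that a lookup condition restricts the candidate dictionary rows to those whose column matches the source row, and that unique substitution gives each distinct source value its own distinct dictionary value. The names `NAME_DICTIONARY`, `substitute_with_lookup`, and `substitute_unique` are hypothetical.

```python
import random

# Hypothetical dictionary with a lookup column ("gender") and a value column.
NAME_DICTIONARY = [
    {"gender": "F", "first_name": "Alice"},
    {"gender": "F", "first_name": "Priya"},
    {"gender": "M", "first_name": "Carlos"},
    {"gender": "M", "first_name": "Tomas"},
]


def substitute_with_lookup(rows, dictionary, rng):
    """Mask first_name using only dictionary rows whose gender matches."""
    masked = []
    for row in rows:
        candidates = [
            entry["first_name"]
            for entry in dictionary
            if entry["gender"] == row["gender"]  # assumed lookup condition
        ]
        masked.append({**row, "first_name": rng.choice(candidates)})
    return masked


def substitute_unique(values, dictionary_values, rng):
    """Give each distinct source value a distinct dictionary value."""
    distinct = list(dict.fromkeys(values))  # keeps first-seen order
    if len(distinct) > len(dictionary_values):
        raise ValueError("dictionary has too few entries for unique substitution")
    picks = rng.sample(dictionary_values, len(distinct))  # no repeats
    mapping = dict(zip(distinct, picks))
    return [mapping[value] for value in values]
```

In this reading, the lookup condition keeps masked data consistent with a related column, and unique substitution prevents two different source values from collapsing into the same masked value.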
You can substitute data with repeatable or non-repeatable values. When you choose repeatable values, the PowerCenter Integration Service produces deterministic results for the same source data and seed value. You must configure a seed value to substitute data with deterministic results. The PowerCenter Integration Service maintains a storage table of source and masked values for repeatable masking. You can specify the storage table you want to use when you generate a workflow.
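Repeatable substitution can be sketched as a deterministic mapping from the source value and seed to a dictionary entry. The Python below is a hedged illustration, not the product's algorithm: hashing with SHA-256 is an assumed technique, the dict `storage` is a hypothetical stand-in for the storage table, and `repeatable_substitute` is an invented name.

```python
import hashlib


def repeatable_substitute(value, dictionary, seed, storage):
    """Return the same dictionary entry for the same (seed, value) pair.

    `storage` plays the role of the storage table of source and masked
    values: once a pair is recorded, later calls reuse it.
    """
    key = (seed, value)
    if key not in storage:
        # Assumed technique: hash the seed and value, then map the hash
        # onto a dictionary position.
        digest = hashlib.sha256(f"{seed}:{value}".encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(dictionary)
        storage[key] = dictionary[index]
    return storage[key]


dictionary = ["Alice", "Priya", "Carlos", "Tomas"]
storage = {}
first = repeatable_substitute("John", dictionary, seed=190, storage=storage)
second = repeatable_substitute("John", dictionary, seed=190, storage=storage)
```

Here `first` and `second` are always equal because both the hash and the stored mapping depend only on the seed and the source value. A different seed may or may not select a different entry.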
You cannot use flat file dictionaries or unique substitution masking to mask Hadoop data.