Shuffle Masking

Shuffle masking replaces the value in a column with a value from the same column in another row of the table. It switches all the values for a column in a file or database table. You can restrict which values to shuffle based on a lookup condition or a constraint. You can apply shuffle masking to date, numeric, and string data types.
For example, you might want to switch the first name values from one customer to another customer in a table. The table includes the following rows:
100 Tom Bender
101 Sue Slade
102 Bob Bold
103 Eli Jones
When you apply shuffle masking, the rows contain the following data:
100 Bob Bender
101 Eli Slade
102 Tom Bold
103 Sue Jones
You can configure shuffle masking to shuffle data randomly or you can configure shuffle masking to return repeatable results.
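The following Python sketch illustrates the general idea of shuffling one column across rows, with an optional seed to make the result repeatable. It is an illustration only, not the product's implementation: the function name `shuffle_column`, the `seed` parameter, and the dictionary-based row layout are all assumptions made for this example.

```python
import random


def shuffle_column(rows, column, seed=None):
    """Return copies of `rows` with the values of `column` permuted across rows.

    Hypothetical helper for illustration only. `seed=None` gives a different
    permutation on each run; a fixed seed gives the same permutation each time.
    """
    rng = random.Random(seed)
    values = [row[column] for row in rows]
    rng.shuffle(values)  # permute the column values in place
    # Rebuild each row, changing only the shuffled column.
    return [{**row, column: value} for row, value in zip(rows, values)]


customers = [
    {"id": 100, "first_name": "Tom", "last_name": "Bender"},
    {"id": 101, "first_name": "Sue", "last_name": "Slade"},
    {"id": 102, "first_name": "Bob", "last_name": "Bold"},
    {"id": 103, "first_name": "Eli", "last_name": "Jones"},
]

# Repeatable: the same seed always yields the same assignment of first names.
masked = shuffle_column(customers, "first_name", seed=42)
for row in masked:
    print(row["id"], row["first_name"], row["last_name"])
```

Note that a plain random permutation like this one can leave some values in their original rows. The sketch does not attempt to prevent that, and it does not model lookup conditions or constraints.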
With Hive and HDFS, you can use shuffle masking only when the source is a relational database and the target is Hive or HDFS.
You cannot use shuffle masking when both the source and the target use Hadoop HDFS connections.
If the source file might have empty strings in the shuffle column, set the Null and Empty Spaces option to Treat as Value in the rule exception handling. When you set the option to Treat as Value, the Data Integration Service masks the space or the null value with a valid value. The default is to skip masking the empty column.
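The difference between the two behaviors can be sketched in Python as follows. This is one possible reading of the option, not the Data Integration Service's actual logic: the function `shuffle_with_empty_handling`, its `treat_as_value` flag, and the choice to fill an empty entry with a randomly picked non-empty value are all assumptions made for this example.

```python
import random


def is_empty(value):
    """Treat None and whitespace-only strings as empty (assumption)."""
    return value is None or str(value).strip() == ""


def shuffle_with_empty_handling(values, treat_as_value=False, seed=None):
    """Shuffle the non-empty values of a column.

    Hypothetical helper for illustration only.
    treat_as_value=False: empty entries are skipped and left unchanged.
    treat_as_value=True: empty entries are replaced with a valid value
    picked from the non-empty values of the same column.
    """
    rng = random.Random(seed)
    pool = [v for v in values if not is_empty(v)]
    shuffled = pool[:]
    rng.shuffle(shuffled)
    replacements = iter(shuffled)

    result = []
    for v in values:
        if not is_empty(v):
            result.append(next(replacements))  # normal shuffled value
        elif treat_as_value and pool:
            result.append(rng.choice(pool))    # mask the empty entry
        else:
            result.append(v)                   # default: skip masking
    return result


names = ["Tom", "", "Bob", None, "Eli"]
skipped = shuffle_with_empty_handling(names, seed=1)
treated = shuffle_with_empty_handling(names, treat_as_value=True, seed=1)
```

In `skipped`, the empty string and the null stay in their original positions. In `treated`, every position holds one of the non-empty names.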