Shuffle masking masks the data in a column with data from the same column in another row of the table. Shuffle masking switches all the values for a column in a file or database table. You can restrict which values to shuffle based on a lookup condition or a constraint. Mask date, numeric, and string data types with shuffle masking.
For example, you might want to switch the first name values from one customer to another customer in a table. The table includes the following rows:
100 Tom Bender
101 Sue Slade
102 Bob Bold
103 Eli Jones
When you apply shuffle masking, the rows contain the following data:
100 Bob Bender
101 Eli Slade
102 Tom Bold
103 Sue Jones
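The example above can be sketched in a few lines of Python. This is only an illustration of the idea, not the product's implementation: the function name `shuffle_column` and the dictionary layout of the rows are assumptions made for this sketch. It permutes the values of one column across rows and leaves every other column untouched.

```python
import random

def shuffle_column(rows, column, rng=None):
    """Return copies of rows with the values of one column permuted across rows.

    Hypothetical helper for illustration; not part of any product API.
    """
    rng = rng or random.Random()
    values = [row[column] for row in rows]
    rng.shuffle(values)  # in-place random permutation of the column values
    # Rebuild each row with its original fields and one shuffled value.
    return [{**row, column: value} for row, value in zip(rows, values)]

customers = [
    {"id": 100, "first_name": "Tom", "last_name": "Bender"},
    {"id": 101, "first_name": "Sue", "last_name": "Slade"},
    {"id": 102, "first_name": "Bob", "last_name": "Bold"},
    {"id": 103, "first_name": "Eli", "last_name": "Jones"},
]

masked = shuffle_column(customers, "first_name")
for row in masked:
    print(row["id"], row["first_name"], row["last_name"])
```

Note that a plain random permutation such as this one can leave some values in their original rows. The sketch does not try to prevent that.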
You can configure shuffle masking to shuffle data randomly or you can configure shuffle masking to return repeatable results.
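The difference between the two modes can be illustrated with a seeded random number generator. This is a sketch under the assumption that repeatability comes from a fixed seed. The function `shuffle_values` and its `seed` parameter are hypothetical names, not product settings.

```python
import random

def shuffle_values(values, seed=None):
    """Return a shuffled copy of values.

    With seed=None the order differs between runs (random shuffle).
    With a fixed seed the same input always yields the same order
    (repeatable shuffle). Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    shuffled = list(values)  # copy so the input list is not modified
    rng.shuffle(shuffled)
    return shuffled

first_names = ["Tom", "Sue", "Bob", "Eli"]

repeatable_a = shuffle_values(first_names, seed=42)
repeatable_b = shuffle_values(first_names, seed=42)
random_run = shuffle_values(first_names)

print(repeatable_a == repeatable_b)  # same seed, same order
print(random_run)                    # order may change from run to run
```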
When you work with Hive or HDFS, you can use shuffle masking only when the source is a relational database and the target is Hive or HDFS.
You cannot use shuffle masking when both the source and the target use Hadoop HDFS connections.
If the source file might have empty strings in the shuffle column, set the Null and Empty Spaces option to Treat as Value in the rule exception handling. When you set the option to Treat as Value, the Data Integration Service masks the space or the null value with a valid value. The default is to skip masking the empty column.
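The two behaviors can be contrasted with a small sketch. It is one possible reading of the description above, not the Data Integration Service's actual logic. The function `shuffle_with_empty_handling`, the `treat_as_value` flag, and the choice to fill an empty cell with a value drawn from the non-empty values in the column are all assumptions.

```python
import random

def is_empty(value):
    """Treat None, empty strings, and whitespace-only strings as empty."""
    return value is None or str(value).strip() == ""

def shuffle_with_empty_handling(values, treat_as_value=False, seed=None):
    """Shuffle non-empty values; handle empty cells according to the flag.

    treat_as_value=False: empty cells are skipped and stay as they are.
    treat_as_value=True:  empty cells receive a valid value (assumption:
                          one drawn from the non-empty values in the column).
    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    valid = [v for v in values if not is_empty(v)]
    pool = list(valid)
    rng.shuffle(pool)
    pool_iter = iter(pool)

    result = []
    for v in values:
        if not is_empty(v):
            result.append(next(pool_iter))    # shuffled non-empty value
        elif treat_as_value and valid:
            result.append(rng.choice(valid))  # fill the empty cell
        else:
            result.append(v)                  # default: skip masking
    return result

names = ["Tom", "", "Bob", None, "Eli"]
skipped = shuffle_with_empty_handling(names, seed=7)
filled = shuffle_with_empty_handling(names, treat_as_value=True, seed=7)
print(skipped)
print(filled)
```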