A dictionary is a reference table that contains the substitute data and a serial number for each row in the table. Create a reference table for substitution masking from a flat file or relational table that you import into the Model repository.
To retrieve a dictionary row, the Data Masking transformation generates a number and looks up the row that has that serial number. For repeatable substitution masking, the number is a hash key. For non-repeatable masking, the number is random. You can configure an additional lookup condition if you configure repeatable substitution masking.
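The difference between the two ways of generating a serial number can be sketched in Python. This is only an illustration: the hash algorithm that the Data Masking transformation uses is not described here, so SHA-256 is a stand-in, and the names `repeatable_serial`, `random_serial`, and the `seed` parameter are hypothetical.

```python
import hashlib
import random


def repeatable_serial(value: str, seed: int, row_count: int) -> int:
    """Map a source value to a serial number in 1..row_count.

    The same value and seed always give the same serial number.
    SHA-256 is an assumed stand-in for the product's hash key.
    """
    digest = hashlib.sha256(f"{seed}:{value}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % row_count + 1


def random_serial(row_count: int) -> int:
    """Pick a serial number in 1..row_count at random (non-repeatable)."""
    return random.randint(1, row_count)
```

With a hash key, the same source value is mapped to the same dictionary row on every run. With a random number, the row can differ from run to run.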
You can configure a dictionary to mask more than one port in the Data Masking transformation.
When the Data Masking transformation retrieves substitution data from a dictionary, the transformation does not check whether the substitute value is the same as the original value. For example, the Data Masking transformation might substitute the name John with the same name (John) from a dictionary file.
The following example shows a dictionary table that contains first name and gender:
SNO  GENDER  FIRSTNAME
1    M       Adam
2    M       Adeel
3    M       Adil
4    F       Alice
5    F       Alison
In this dictionary, the first field in the row is the serial number, and the second field is gender. The Integration Service always looks up a dictionary record by serial number. You can add gender as a lookup condition if you configure repeatable masking. The Integration Service retrieves a row from the dictionary using a hash key, and it finds a row with a gender that matches the gender in the source data.
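One plausible way to combine the hash key with the gender lookup condition is sketched below. How the Integration Service actually resolves the condition is not specified here. This sketch assumes it starts at the hashed serial number and moves forward through the dictionary, wrapping around, until it finds a row whose gender matches. The function names and the use of SHA-256 are hypothetical.

```python
import hashlib

# The example dictionary table with first name and gender.
DICTIONARY = [
    {"SNO": 1, "GENDER": "M", "FIRSTNAME": "Adam"},
    {"SNO": 2, "GENDER": "M", "FIRSTNAME": "Adeel"},
    {"SNO": 3, "GENDER": "M", "FIRSTNAME": "Adil"},
    {"SNO": 4, "GENDER": "F", "FIRSTNAME": "Alice"},
    {"SNO": 5, "GENDER": "F", "FIRSTNAME": "Alison"},
]


def hashed_serial(value, row_count):
    """Derive a serial number in 1..row_count from the source value (assumed hash)."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % row_count + 1


def substitute_first_name(name, gender, rows=DICTIONARY):
    """Return a substitute first name whose row matches the source gender.

    Starts at the hashed serial number and scans forward, wrapping around,
    until the lookup condition is met. Returns None if no row matches.
    """
    by_serial = {row["SNO"]: row for row in rows}
    start = hashed_serial(name, len(rows))
    for offset in range(len(rows)):
        serial = (start - 1 + offset) % len(rows) + 1
        row = by_serial[serial]
        if row["GENDER"] == gender:
            return row["FIRSTNAME"]
    return None
```

Because the starting serial number depends only on the source value, the same name and gender produce the same substitute each time.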
Use the following rules and guidelines when you create a reference table:
Each record in the table must have a serial number.
The serial numbers must be sequential integers starting at one, with no gaps in the sequence.
The serial number column can be anywhere in a table row. It can have any label.
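The serial number rules above can be checked before you import the table. The following is a minimal sketch, assuming the rows are already loaded as dictionaries keyed by column label. The function name `validate_serial_numbers` is hypothetical, and treating a repeated serial number as an error is an assumption drawn from the word "sequential".

```python
def validate_serial_numbers(rows, serial_column="SNO"):
    """Raise ValueError unless the serial numbers are exactly 1..N.

    The serial column can have any label, so it is passed by name.
    Row order is not checked, only the set of values.
    """
    serials = [int(row[serial_column]) for row in rows]
    expected = set(range(1, len(serials) + 1))
    if len(set(serials)) != len(serials):
        raise ValueError("duplicate serial numbers")
    if set(serials) != expected:
        missing = sorted(expected - set(serials))
        raise ValueError(f"serial numbers missing from sequence: {missing}")
```

For example, a table with serial numbers 1, 2, and 4 fails the check because 3 is missing.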
If you use a flat file table to create the reference table, use the following rules and guidelines:
The first row of the flat file table must have column labels to identify the fields in each record. The fields are separated by commas. If the first row does not contain column labels, the Integration Service takes the values of the fields in the first row as column names.
If you create a flat file table on Windows and copy it to a UNIX machine, verify that the file format is correct for UNIX. For example, Windows and UNIX use different characters for the end of line marker.
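The end-of-line difference can be handled with a small conversion step. Windows ends lines with a carriage return followed by a line feed, and UNIX uses a line feed alone. The sketch below is one way to convert a file and is not part of the product. The function names are hypothetical.

```python
from pathlib import Path


def to_unix_line_endings(data: bytes) -> bytes:
    """Replace Windows CR+LF line endings with UNIX LF line endings."""
    return data.replace(b"\r\n", b"\n")


def convert_file(source: str, target: str) -> None:
    """Read the source file as bytes and write a UNIX-format copy to target."""
    Path(target).write_bytes(to_unix_line_endings(Path(source).read_bytes()))
```

Working on bytes avoids any automatic newline translation that text-mode file handling might apply.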