Table of Contents

Search

  1. Preface
  2. Working with Transformations
  3. Aggregator Transformation
  4. Custom Transformation
  5. Custom Transformation Functions
  6. Data Masking Transformation
  7. Data Masking Examples
  8. Expression Transformation
  9. External Procedure Transformation
  10. Filter Transformation
  11. HTTP Transformation
  12. Identity Resolution Transformation
  13. Java Transformation
  14. Java Transformation API Reference
  15. Java Expressions
  16. Java Transformation Example
  17. Joiner Transformation
  18. Lookup Transformation
  19. Lookup Caches
  20. Dynamic Lookup Cache
  21. Normalizer Transformation
  22. Rank Transformation
  23. Router Transformation
  24. Sequence Generator Transformation
  25. Sorter Transformation
  26. Source Qualifier Transformation
  27. SQL Transformation
  28. Using the SQL Transformation in a Mapping
  29. Stored Procedure Transformation
  30. Transaction Control Transformation
  31. Union Transformation
  32. Unstructured Data Transformation
  33. Update Strategy Transformation
  34. XML Transformations

Transformation Guide

Transformation Guide

Substituting Data with the Lookup Transformation

Substituting Data with the Lookup Transformation

You can substitute a column of data with similar but unrelated data. Substitution is an effective way to mask sensitive data with a realistic looking data.
The following example shows how to configure multiple Lookup transformations to retrieve test data and substitute it for source data. Create a Data Masking mapping to mask the sensitive fields in CUSTOMERS_PROD table.
The example includes the following types of masking:
  • Name and address substitution from Lookup tables
  • Key masking
  • Blurring
  • Special mask formats
This example is the M_CUSTOMERS_MASKING.xml mapping that you can import to your repository from the
client\samples
folder.
A customer database table called Customers_Prod contains sensitive data. You want to use the customer data in a test scenario, but you want to maintain security. You mask the data in each column and write the test data to a target table called Customers_Test.
The Customers_Prod table contains the following columns:
Column
Datatype
CustID
Integer
FullName
String
Address
String
Phone
String
Fax
String
CreatedDate
Date
Email
String
SSN
String
CreditCard
String
You can create a mapping that looks up substitute values in dictionary files. The Data Masking transformation masks the customer data with values from the dictionary files. The files include a first name file, a surname file, and an address file.
The following table lists the files that are located in the
server\infa_shared\LkpFiles
folder:
File
Number of Records
Fields
Description
Firstnames.dic
21,000
SNO, Gender, Firstname
Alphabetical list of first names. The serial number goes from 1 to 21,000. Gender indicates whether the name is male or female.
Surnames.dic
81,000
SNO, Surname
Alphabetical list of last names. The serial number goes from 1 to 81,000.
Address.dic
13,000
SNO, Street, City, State, Zip, Country
List of complete addresses. The serial number goes from 1 to 13,000.
Informatica includes the gender column in the Firstnames.dic file so you can create separate lookup source files by gender. You can use the gender column in a lookup condition if you need to mask with a male name and a female name with a female name.
The following figure shows the mapping that you can import:
""
The mapping has the following transformations along with a source and target:
  • Source Qualifier.
    Passes customer data to the Data Masking transformation. It passes the CustID column to multiple ports in the transformation
    :
    • CustID
      . Customer number.
    • Randid1
      . Random number generator for first name lookups.
    • Randid2
      . Random number generator for last name lookups.
    • Randid3
      . Random number generator for address lookups.
  • Data Masking transformation.
    Creates random numbers to look up the replacement first name, last name, and address. Applies special mask formats to the phone number, fax, email address, and credit card number. The Data Masking transformation masks the following columns:
    Input Port
    Masking Type
    Masking Rules
    Description
    Output Destination
    CustID
    Key
    Seed = 934
    CustID is the primary key column. It must be masked with a random number that is repeatable and deterministic.
    Customers_Test
    Randid1
    Random
    Range
    Minimum = 0
    Maximum = 21000
    Random number for first name lookup in the LKUP_Firstnames transformation.
    LKUP_Firstnames
    Randid2
    Random
    Range
    Minimum = 0
    Maximum = 13000
    Random number for last name lookup in the LKUP_Surnames transformation.
    LKUP_Surnames
    Randid3
    Random
    Range
    Minimum = 0
    Maximum = 81000
    Random number for address lookup in the LKUP_Address transformation.
    LKUP_Address
    Phone
    Phone
    -
    Phone number has the same format as the source phone number.
    Customers_Test
    Fax
    Phone
    -
    Phone number has the same format as the source phone number.
    Customers_Test
    CreatedDate
    Random
    Blurring
    Unit = Year
    Low Bound = 1
    High Bound = 1
    Random date that is within a year of the source year.
    Customers_Test
    Email
    Email Address
    -
    Email address has the same format as the original.
    Customers_Test
    SSN
    SSN
    -
    SSN is not in the highgroup.txt file.
    Customers_Test
    CreditCard
    Credit Card
    -
    Credit card has the same first six digits as the source and has a valid checksum.
    Customers_Test
  • LKUP_Firstnames.
    Performs a flat file lookup on Firstnames.dic. The transformation retrieves the record with the serial number equal to the random number Randid1. The lookup condition is:
    SNO = out_RANDID1
    The LKUP_Firstnames transformation passes the masked first name to the Exptrans Expression transformation.
  • LKUP_Surnames.
    Performs a flat file lookup on the Surnames.dic file. It retrieves the record with the serial number equal to Randid2. The LKUP_Firstnames transformation passes a masked last name to the Exptrans Expression transformation
    .
  • Exptrans.
    Combines the first and last name and returns a full name. The Expression transformation passes the full name to the Customers_Test target
    .
    The Expression to combine the first and last names is:
    FIRSTNAME || ' ' || SURNAME
  • LKUP_Address
    . Performs a flat file lookup on the Address.dic file. It retrieves the address record with the serial number equal to Randid3. The Lookup transformation passes the columns in the address to the target.
You can use the Customer_Test table in a test environment.

0 COMMENTS

We’d like to hear from you!