Using the Data Quality Accelerator for Crisis Response

Contact Data Cleansing Rules

Use the contact data cleansing rules to parse, standardize, and validate data about business contacts and individuals. Find the contact data cleansing rules in the following repository location:
[Project_Name]\Rules\Contact_Data_Cleansing
The following table describes the contact data cleansing rules:
Name
Description
rule_Email_Parse
Parses email addresses from data fields.
rule_Email_Parse_and_Validate
Parses email addresses from data fields and validates the format of each email address.
rule_Email_Parse_Into_Mailbox_Domain
Parses email addresses into mailbox, subdomain, and domain fields. For example, the rule parses info@informatica.com into the mailbox "info", the subdomain "informatica", and the domain "com".
rule_Email_Validation
Validates the format of email addresses. The rule does not verify that the email addresses are accurate or active. The rule returns "Valid" or "Invalid."
rule_Identify_Suspect_Names
Identifies names that might not be genuine person names. The rule compares the input values to a reference table of names that are unlikely to be genuine. For example, the reference table includes the names of fictional characters.
rule_Prename_Assignment
Generates an honorific according to the gender. You can change the female_prename expression variable from Ms. to Mrs.
rule_Salutation_Assignment
Generates formal and casual greetings from prenames and name tokens. For example, when the input data contains the name Mr. John Smith, the rule generates the formal greeting "Dear Mr. Smith," and the casual greeting "Dear John," with the trailing comma included. You can change the prefix and punctuation by editing the variables in the dq_Generate_Salutation Expression transformation.
rule_USA_Gender_Assignment
Assigns gender according to first name. The rule returns "M" for male names, "F" for female names, and "U" if the gender is unknown. For example, the rule assigns the name "John Smith" a gender of "M" for male.
rule_USA_Given_Name_Standard
Generates given names from U.S. nicknames. For example, the rule standardizes the nickname "Bob" to the given name "Robert."
rule_USA_Multi_Person_Name_Parse
Parses person name values into separate fields. The rule creates fields for values such as title, first name, middle name, and surname. The rule output includes a field that contains the full name of the person in the record. You can use the full name field as an input to a Match transformation in an identity match analysis mapping. When the name data identifies more than one person, the rule creates an output field for each full name. For example, the rule can read the name "John and Jane Smith" and create output fields for "John Smith" and "Jane Smith."
rule_USA_Personal_Name_Parse_and_Standardize_FML
Parses the values in a person name into separate fields. The rule also standardizes the name values. The rule creates the fields in the following sequence: first name, middle name, last name. The rule output also includes a field that contains the full name of the person in the record. You can use the full name field as an input to a Match transformation in an identity match analysis mapping.
rule_USA_Personal_Name_Parse_and_Standardize_LFM
Parses the values in a person name into separate fields. The rule also standardizes the name values. The rule creates the fields in the following sequence: last name, first name, middle name. The rule output also includes a field that contains the full name of the person in the record. You can use the full name field as an input to a Match transformation in an identity match analysis mapping.
rule_USA_Personal_Name_Parse_Validation
Validates the gender assignment for a name. The rule calculates the probabilities that a data value is a male name or a female name. If the gender is unknown, the rule uses the probability calculations to assign a gender to the name.
rule_USA_Personal_Name_Parsing_FML
Parses the values in a person name into separate fields. The rule creates the fields in the following sequence: first name, middle name, last name. The rule output also includes a field that contains the full name of the person in the record. You can use the full name field as an input to a Match transformation in an identity match analysis mapping. Note: The rule does not standardize the name values. To standardize and parse United States name values in the sequence that the rule defines, select rule_USA_Personal_Name_Parse_and_Standardize_FML.
rule_USA_Personal_Name_Parsing_LFM
Parses the values in a person name into separate fields. The rule creates the fields in the following sequence: last name, first name, middle name. The rule output also includes a field that contains the full name of the person in the record. You can use the full name field as an input to a Match transformation in an identity match analysis mapping. Note: The rule does not standardize the name values. To standardize and parse United States name values in the sequence that the rule defines, select rule_USA_Personal_Name_Parse_and_Standardize_LFM.
rule_USA_Phone_Number_Parse
Parses a United States telephone number from a string. The rule parses the first telephone number in the data, reading from right to left. The rule returns a telephone number and also returns a string that contains the input text with the telephone number removed.
rule_USA_Phone_Number_Standardization
Standardizes United States telephone numbers. The rule returns the telephone number in the following formats: Standard, (nnn) nnn-nnnn; Dashes, nnn-nnn-nnnn; and No Spaces, nnnnnnnnnn.
rule_USA_Phone_Number_Validation
Validates the area code and length of United States telephone numbers. The rule returns values that indicate if the area code and length of a telephone number are valid.
rule_USA_Phone_Parse_Standardize_Validate
Parses a telephone number from a string of text and verifies that the area code is valid for the United States. If the area code is valid, the rule returns the telephone number in three formats. The rule also returns a status value to indicate whether the data conforms to the standard format for a United States telephone number.
rule_USA_Phone_w_Extension_Parse
Parses a number from a string of text if the number conforms to the standard format for a United States telephone number. The rule includes any telephone extension data when it parses the telephone number.
rule_USA_SSN_Parse
Parses United States Social Security numbers (SSN) from data.
rule_USA_SSN_Parse_Standardize_and_Validate
Parses, standardizes, and validates United States Social Security numbers from a larger string of text. The rule can parse numbers that include or omit dashes. By default, the rule writes Social Security numbers without any punctuation. To change the standardization format, open the dq_SSN_Format transformation in the rule and update the expression on the SSN_Format field.
rule_USA_SSN_Standardization
Standardizes United States Social Security numbers. The rule can output the following formats: No Punctuation, nnnnnnnnn; Space, nnn nnn nnn; and Dash, nnn-nnn-nnn. To change the format, edit the SSN_Format expression variable in the dq_SSN_Format Expression transformation. The default is "No_Punctuation."
rule_USA_SSN_Validation
Validates United States Social Security numbers. The rule validates each Social Security number for length, numeric values, and known minimum and maximum values in the Area, Group, and Serial Number sections. The Area section comprises the first three digits of the number, and the Group section comprises the fourth and fifth digits. The Serial Number section comprises the final four digits. If the number was issued prior to June 2011, the rule also verifies that the Area value and Group value are a valid combination. The rule does not verify that the number is an issued number. The rule returns "Valid" or "Invalid."
rule_USA_SSN_Validation_post_June2011
Validates United States Social Security numbers. The rule validates each Social Security number for length, numeric values, and known minimum and maximum values in the Area, Group, and Serial Number sections. The Area section comprises the first three digits of the number, and the Group section comprises the fourth and fifth digits. The Serial Number section comprises the final four digits. The rule does not verify that the Area value and Group value are a valid combination. The rule does not verify that the number is an issued number. The rule returns "Valid" or "Invalid."
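
The rules in the table run as mapplets inside the product, and their internal logic is not documented here. To make the described behavior concrete, the following Python sketch approximates three of the rules outside the product: splitting an email address into mailbox, subdomain, and domain fields, writing a telephone number in the three documented formats, and checking the structure of a Social Security number without an Area and Group combination check. The function names are hypothetical, and the specific SSN range limits (Area 000, 666, and 900-999, Group 00, Serial Number 0000) are assumptions drawn from published Social Security Administration numbering rules, not from the rule definitions.

```python
import re


def parse_email(address):
    """Split an address into mailbox, subdomain, and domain fields.

    Hypothetical helper that mirrors the info@informatica.com example.
    Returns None when the value does not look like mailbox@subdomain.domain.
    """
    mailbox, at_sign, host = address.strip().rpartition("@")
    if not at_sign or not mailbox:
        return None
    subdomain, dot, domain = host.rpartition(".")
    if not dot or not subdomain or not domain:
        return None
    return {"mailbox": mailbox, "subdomain": subdomain, "domain": domain}


def standardize_us_phone(raw):
    """Return a 10-digit telephone number in the three documented formats.

    Hypothetical helper. It only checks the digit count and does not
    verify that the area code is assigned.
    """
    digits = re.sub(r"[^0-9]", "", raw)
    if len(digits) != 10:
        return None
    area, exchange, line = digits[:3], digits[3:6], digits[6:]
    return {
        "standard": f"({area}) {exchange}-{line}",
        "dashes": f"{area}-{exchange}-{line}",
        "no_spaces": digits,
    }


def validate_ssn_structure(raw):
    """Check length, numeric content, and section ranges of an SSN.

    Hypothetical helper. The range limits are assumptions (see above).
    It does not check Area and Group combinations, and it does not
    verify that the number was issued.
    """
    digits = re.sub(r"[\s-]", "", raw)
    if not re.fullmatch(r"[0-9]{9}", digits):
        return "Invalid"
    area, group, serial = digits[:3], digits[3:5], digits[5:]
    if area in ("000", "666") or area >= "900":
        return "Invalid"
    if group == "00" or serial == "0000":
        return "Invalid"
    return "Valid"


if __name__ == "__main__":
    print(parse_email("info@informatica.com"))
    print(standardize_us_phone("555.867.5309"))
    print(validate_ssn_structure("123-45-6789"))
```

As with the rules themselves, the validation functions check format only. A "Valid" result does not mean that an email address is active, that a telephone number is in service, or that a Social Security number was issued.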
