Accelerator Guide

U.S./Canada Contact Data Cleansing Rules

Use contact data cleansing rules to parse, standardize, and validate data about business contacts and individuals.

Find the contact data cleansing rules in the following repository location:

[Informatica_DQ_Content]\Rules\Contact_Data_Cleansing

The following table describes the contact data cleansing rules in the U.S./Canada accelerator:

Name	Description
rule_CAN_Gender_Assignment	Assigns gender according to first names. The rule returns "M" for male names, "F" for female names, and "U" if the gender is unknown. For example, the rule assigns the name "John Smith" a gender of "M" for male.
rule_CAN_Given_Name_Standard	Generates given names from Canadian nicknames. For example, the rule standardizes the nickname "Bob" to the given name "Robert."
rule_CAN_Multi_Person_Name_Parse	Parses person name values into separate ports. The rule creates ports for values such as title, first name, middle name, and surname. The rule output includes a port that contains the full name of the person in the record. You can use the full name port as an input to a Match transformation in an identity match analysis mapping. When the name data identifies more than one person, the rule creates an output port for each full name. For example, the rule can read the name "John and Jane Smith" and create output ports for "John Smith" and "Jane Smith."
rule_CAN_Personal_Name_Parse_and_Standardize_FML	Parses the values in a person name into separate ports. The rule also standardizes the name values. The rule creates the ports in the following sequence: first name, middle name, last name. The rule output also includes a port that contains the full name of the person in the record. You can use the full name port as an input to a Match transformation in an identity match analysis mapping.
rule_CAN_Personal_Name_Parse_and_Standardize_LFM	Parses the values in a person name into separate ports. The rule also standardizes the name values. The rule creates the ports in the following sequence: last name, first name, middle name. The rule output also includes a port that contains the full name of the person in the record. You can use the full name port as an input to a Match transformation in an identity match analysis mapping.
rule_CAN_Personal_Name_Parsing_FML	Parses the values in a person name into separate ports. The rule creates the ports in the following sequence: first name, middle name, last name. The rule output also includes a port that contains the full name of the person in the record. You can use the full name port as an input to a Match transformation in an identity match analysis mapping. The rule does not standardize the name values. To standardize and parse Canadian name values in the sequence that the rule defines, select rule_CAN_Personal_Name_Parse_and_Standardize_FML.
rule_CAN_Personal_Name_Parsing_LFM	Parses the values in a person name into separate ports. The rule creates the ports in the following sequence: last name, first name, middle name. The rule output also includes a port that contains the full name of the person in the record. You can use the full name port as an input to a Match transformation in an identity match analysis mapping. The rule does not standardize the name values. To standardize and parse Canadian name values in the sequence that the rule defines, select rule_CAN_Personal_Name_Parse_and_Standardize_LFM.
rule_CAN_Phone_Number_Parse	Parses a Canadian telephone number from a string. The rule parses the first telephone number in the data, reading from right to left. The rule returns a telephone number and also returns a string that contains the input text with the telephone number removed.
rule_CAN_Phone_Number_Standardization	Standardizes Canadian telephone numbers. The rule returns the telephone number in the following formats: Standard - (nnn) nnn-nnnn; Dashes - nnn-nnn-nnnn; No Spaces - nnnnnnnnnn.
rule_CAN_Phone_Number_Validation	Validates the area code and length of Canadian telephone numbers. The rule returns codes that indicate telephone number type and validity. Types describe categories such as "toll-free."
rule_CAN_Phone_Parse_Standardize_Validate	Parses a telephone number from a string of text and verifies that the area code is valid for Canada. If the area code is valid, the rule returns the telephone number in three standard formats. The rule also returns a status value to indicate whether the data conforms to the standard format for a Canadian telephone number.
rule_CAN_Phone_w_Extension_Parse	Parses a number from a string of text if the number conforms to the standard format for a Canadian telephone number. The rule includes any telephone extension data when it parses the telephone number.
rule_CAN_SIN_Parse	Parses a Canadian Social Insurance Number (SIN) from a string. The rule returns the SIN and also returns a string that contains the input text with the SIN removed.
rule_CAN_SIN_Standardization	Standardizes Canadian Social Insurance Numbers (SIN). The rule can output the following formats: No Punctuation - nnnnnnnnn; Space - nnn nnn nnn; Dash - nnn-nnn-nnn. To change the format, edit the SIN_Format expression variable in the dq_Format_SIN Expression transformation. Default is "No_Punctuation."
rule_CAN_SIN_Validation	Validates Canadian Social Insurance Numbers (SIN). The rule uses the Luhn algorithm to verify whether or not a SIN is valid. The rule returns "Valid" or "Invalid."
rule_Prename_Assignment	Generates an honorific according to the gender. You can change the female_prename expression variable from Ms. to Mrs.
rule_Salutation_Assignment	Generates formal and casual greetings from prenames and name tokens. For example, when input data contains "Mr. John Smith," the rule generates the formal greeting "Dear Mr. Smith," and the casual greeting "Dear John,". You can change the prefix and punctuation by editing the variables in the dq_Generate_Salutation Expression transformation.
rule_USA_Gender_Assignment	Assigns gender according to first name. The rule returns "M" for male names, "F" for female names, and "U" if the gender is unknown. For example, the rule assigns the name "John Smith" a gender of "M" for male.
rule_USA_Given_Name_Standard	Generates given names from U.S. nicknames. For example, the rule standardizes the nickname "Bob" to the given name "Robert."
rule_USA_Multi_Person_Name_Parse	Parses person name values into separate ports. The rule creates ports for values such as title, first name, middle name, and surname. The rule output includes a port that contains the full name of the person in the record. You can use the full name port as an input to a Match transformation in an identity match analysis mapping. When the name data identifies more than one person, the rule creates an output port for each full name. For example, the rule can read the name "John and Jane Smith" and create output ports for "John Smith" and "Jane Smith."
rule_USA_Personal_Name_Parse_and_Standardize_FML	Parses the values in a person name into separate ports. The rule also standardizes the name values. The rule creates the ports in the following sequence: first name, middle name, last name. The rule output also includes a port that contains the full name of the person in the record. You can use the full name port as an input to a Match transformation in an identity match analysis mapping.
rule_USA_Personal_Name_Parse_and_Standardize_LFM	Parses the values in a person name into separate ports. The rule also standardizes the name values. The rule creates the ports in the following sequence: last name, first name, middle name. The rule output also includes a port that contains the full name of the person in the record. You can use the full name port as an input to a Match transformation in an identity match analysis mapping.
rule_USA_Personal_Name_Parse_Validation	Validates the gender assignment for a name. The rule calculates the probabilities that a data value is a male name or a female name. If the gender is unknown, the rule uses the probability calculations to assign a gender to the name.
rule_USA_Personal_Name_Parsing_FML	Parses the values in a person name into separate ports. The rule creates the ports in the following sequence: first name, middle name, last name. The rule output also includes a port that contains the full name of the person in the record. You can use the full name port as an input to a Match transformation in an identity match analysis mapping. The rule does not standardize the name values. To standardize and parse United States name values in the sequence that the rule defines, select rule_USA_Personal_Name_Parse_and_Standardize_FML.
rule_USA_Personal_Name_Parsing_LFM	Parses the values in a person name into separate ports. The rule creates the ports in the following sequence: last name, first name, middle name. The rule output also includes a port that contains the full name of the person in the record. You can use the full name port as an input to a Match transformation in an identity match analysis mapping. The rule does not standardize the name values. To standardize and parse United States name values in the sequence that the rule defines, select rule_USA_Personal_Name_Parse_and_Standardize_LFM.
rule_USA_Phone_Number_Parse	Parses a United States telephone number from a string. The rule parses the first telephone number in the data, reading from right to left. The rule returns a telephone number and also returns a string that contains the input text with the telephone number removed.
rule_USA_Phone_Number_Standardization	Standardizes United States telephone numbers. The rule returns the telephone number in the following formats: Standard - (nnn) nnn-nnnn; Dashes - nnn-nnn-nnnn; No Spaces - nnnnnnnnnn.
rule_USA_Phone_Number_Validation	Validates the area code and length of United States telephone numbers. The rule returns codes that indicate if the area code and length of a telephone number are valid.
rule_USA_Phone_Parse_Standardize_Validate	Parses a telephone number from a string of text and verifies that the area code is valid for the United States. If the area code is valid, the rule returns the telephone number in three standard formats. The rule also returns a status value to indicate whether the data conforms to the standard format for a United States telephone number.
rule_USA_Phone_w_Extension_Parse	Parses a number from a string of text if the number conforms to the standard format for a United States telephone number. The rule includes any telephone extension data when it parses the telephone number.
rule_USA_SSN_Parse	Parses United States Social Security Numbers (SSN).
rule_USA_SSN_Parse_Standardize_and_Validate	Parses, standardizes, and validates United States Social Security Numbers from a larger string of text. The rule can parse numbers that include or omit dashes. By default, the rule writes Social Security Numbers without any punctuation. To change the standardization format, open the dq_SSN_Format transformation in the rule and update the expression on the SSN_Format port.
rule_USA_SSN_Standardization	Standardizes United States Social Security Numbers (SSN). The rule can output the following formats: No Punctuation - nnnnnnnnn; Space - nnn nnn nnn; Dash - nnn-nnn-nnn. To change the format, edit the SSN_format expression variable in the dq_SSN_Format Expression transformation. Default is "No_Punctuation."
rule_USA_SSN_Validation	Validates United States Social Security Numbers (SSN). The rule validates each SSN for length, numeric values, and known minimum and maximum values in the Area, Group, and Serial Number sections. The Area section comprises the first three digits of the SSN, and the Group section comprises the fourth and fifth digits. The Serial Number section comprises the final four digits. If the SSN was issued prior to June 2011, the rule also verifies that the Area value and Group value are a valid combination. The rule does not verify that the SSN is an issued number. The rule returns "Valid" or "Invalid."
rule_USA_SSN_Validation_post_June2011	Validates United States Social Security Numbers (SSN). The rule validates each SSN for length, numeric values, and known minimum and maximum values in the Area, Group, and Serial Number sections. The Area section comprises the first three digits of the SSN, and the Group section comprises the fourth and fifth digits. The Serial Number section comprises the final four digits. The rule does not verify that the Area value and Group value are a valid combination. The rule does not verify that the SSN is an issued number. The rule returns "Valid" or "Invalid."
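
The table states that rule_CAN_SIN_Validation uses the Luhn algorithm and returns "Valid" or "Invalid." The following Python sketch shows how a Luhn check on a nine-digit SIN can work. It is an illustration only and not the rule's implementation: the function names luhn_is_valid and validate_sin are hypothetical, and the choice to ignore spaces and dashes before checking is an assumption.

```python
def luhn_is_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if not digits:
        return False
    total = 0
    # Walk from the rightmost digit and double every second digit.
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the two digits of the product
        total += digit
    return total % 10 == 0


def validate_sin(value: str) -> str:
    """Hypothetical helper that mirrors the "Valid"/"Invalid" output."""
    # Assumption: separators are ignored and a SIN must have nine digits.
    digits = "".join(ch for ch in value if ch.isdigit())
    if len(digits) != 9:
        return "Invalid"
    return "Valid" if luhn_is_valid(digits) else "Invalid"


print(validate_sin("046 454 286"))  # Valid
print(validate_sin("046 454 287"))  # Invalid
```

The value 046 454 286 is a commonly used test SIN that passes the checksum. A passing checksum does not show that a SIN was issued.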
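
The phone number standardization rules in the table return each number in three formats: Standard, Dashes, and No Spaces. The following Python sketch reproduces those three output shapes for a ten-digit number. The function name standardize_phone, the handling of a leading country code "1", and the None return for other lengths are assumptions made for illustration. The sketch does not check area codes, which the validation rules handle separately.

```python
import re
from typing import Dict, Optional


def standardize_phone(raw: str) -> Optional[Dict[str, str]]:
    """Return the three formats from the table, or None if the input
    does not reduce to a ten-digit number (hypothetical helper)."""
    digits = re.sub(r"\D", "", raw)  # keep digits only
    # Assumption: an 11-digit value that starts with 1 carries a country code.
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    if len(digits) != 10:
        return None
    area, exchange, line = digits[:3], digits[3:6], digits[6:]
    return {
        "Standard": f"({area}) {exchange}-{line}",
        "Dashes": f"{area}-{exchange}-{line}",
        "No Spaces": digits,
    }


print(standardize_phone("416.555.0123"))
# {'Standard': '(416) 555-0123', 'Dashes': '416-555-0123', 'No Spaces': '4165550123'}
```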
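
The SSN validation rules in the table check length, numeric content, and the range of the Area, Group, and Serial Number sections. The following Python sketch shows one way to express checks of that kind. The function name validate_ssn_structure is hypothetical, and the specific ranges are assumptions based on publicly documented Social Security Administration conventions (no Area of 000, 666, or 900 to 999; no Group of 00; no Serial Number of 0000). The ranges that the accelerator rules apply may differ. The sketch does not check Area and Group combinations for numbers issued before June 2011.

```python
def validate_ssn_structure(value: str) -> str:
    """Return "Valid" or "Invalid" from structural checks only."""
    # Assumption: dashes and spaces are the only separators accepted.
    candidate = value.replace("-", "").replace(" ", "")
    # Length and numeric checks.
    if len(candidate) != 9 or not (candidate.isascii() and candidate.isdigit()):
        return "Invalid"
    area = int(candidate[:3])     # first three digits
    group = int(candidate[3:5])   # fourth and fifth digits
    serial = int(candidate[5:])   # final four digits
    # Assumed range limits; see the lead-in paragraph.
    if area == 0 or area == 666 or area >= 900:
        return "Invalid"
    if group == 0 or serial == 0:
        return "Invalid"
    return "Valid"


print(validate_ssn_structure("078-05-1120"))  # Valid
print(validate_ssn_structure("000-12-3456"))  # Invalid
```

As the table notes for the rules themselves, a "Valid" result from checks like these does not show that the SSN is an issued number.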