Data Quality accelerator bundles

Data Quality bundle for United Kingdom

Use the Data Quality bundle for United Kingdom to accelerate the configuration and deployment of data quality solutions in United Kingdom organizations. The bundle includes mapplets and other assets that define data standardization, deduplication, address verification, and parsing operations for United Kingdom data.
The following table lists the assets contained in the Data Quality bundle for United Kingdom:
Name
Asset Type
Description
c_GBR_Get_Company_Acronym
Cleanse
Standardizes acronyms for United Kingdom company names.
c_GBR_Get_Company_Name_Std
Cleanse
Standardizes United Kingdom company names.
c_GBR_given_name_standard
Cleanse
Standardizes given names and removes extraneous character spaces.
c_GBR_Remove_Intl_Dial_Code
Cleanse
Removes variations of the United Kingdom international dialing code from the start of the input string. For example, +44 and 0044.
c_GBR_Standardize_Company_Suffix
Cleanse
Standardizes United Kingdom company name suffixes.
c_Remove_Extra_Spaces
Cleanse
Replaces multiple consecutive spaces with a single space and trims leading and trailing spaces.
c_Remove_Intl_call_Prefix
Cleanse
Removes the international dialing prefix from telephone numbers.
c_remove_labels
Cleanse
Removes the characters X and 9.
c_Remove_Leading_Zero
Cleanse
Removes the first character from a number if the character is a zero.
c_Remove_Numbers
Cleanse
Removes numbers from a string.
c_Remove_Period_Parentheses
Cleanse
Removes all periods and left and right parentheses.
c_Remove_Punctuation
Cleanse
Removes all punctuation from the input field.
c_Remove_Space
Cleanse
Removes all occurrences of a character space from the input field.
c_replace_ltd_punct_w_space
Cleanse
Replaces a limited set of punctuation symbols with character spaces.
c_replace_numbers_alpha_w_9X
Cleanse
Replaces all digits with 9 and all alphabetical characters with X.
c_SR_FullName
Cleanse
Standardizes full names.
c_Titlecase
Cleanse
Converts the input text to title case.
c_Uppercase
Cleanse
Converts the input text to uppercase.
c_Validation_Domain
Cleanse
Validates the domain of an email address.
dedupe_GBR_CompanyName_Postcode_Match
Deduplicate
Identifies duplicate records in United Kingdom data based on company names and postal codes.
dedupe_GBR_Familyname_NINO_Match
Deduplicate
Identifies duplicate records in United Kingdom data based on surnames and National Insurance Numbers (NINO).
dedupe_GBR_Familyname_Postcode_Match
Deduplicate
Identifies duplicate records in United Kingdom data based on surnames and United Kingdom postal codes.
dedupe_GBR_FN_3CharSN_DOB_Zip_Match
Deduplicate
Identifies duplicate records in United Kingdom data based on the following data:
  • First name
  • The first three characters in the surname
  • Date of birth
  • Postal code
dedupe_GBR_FN_SN_2EDOB_Postcode_Match
Deduplicate
Identifies duplicate records in United Kingdom data based on the following data:
  • Person name
  • Any two date of birth elements, such as month and year
  • United Kingdom postal code
dedupe_GBR_FN_SN_DOB_Postcode_Match
Deduplicate
Identifies duplicate records based on the following data:
  • Person name
  • Date of birth
  • Postal code
dedupe_GBR_IMO_Comp_Name_Addr_Match
Deduplicate
Identifies duplicate records in United Kingdom data based on company names and addresses.
dedupe_GBR_Individual_Name_and_Email
Deduplicate
Identifies duplicate records in United Kingdom data based on person names and email address data.
Dedupe_GBR_Individual_Name_and_NINO
Deduplicate
Identifies duplicate records in United Kingdom data based on person names and National Insurance Numbers (NINO).
Dedupe_GBR_Name_Postcode_Match
Deduplicate
Identifies duplicate records in United Kingdom data based on person names and postcode data.
domain_names_infa
Dictionary
Contains a list of domain names.
dq_av_geocodingstatus_infa
Dictionary
Contains geocoding status values and their descriptions.
dq_av_match_code_descriptions_infa
Dictionary
Contains alphanumeric status values that indicate the outcome of the verification operation for an address. Each status value has a corresponding text description.
gbr_bank_sort_codes_infa
Dictionary
Contains United Kingdom bank sort codes.
gbr_company_acronyms_infa
Dictionary
Contains United Kingdom company name acronyms.
gbr_company_name_sample
Dictionary
Contains sample United Kingdom company names.
gbr_company_names_infa
Dictionary
Contains a list of United Kingdom company names.
gbr_company_sufx_abrv_infa
Dictionary
Contains company suffix abbreviations in use in the United Kingdom.
gbr_gender_infa
Dictionary
Contains a list of names and corresponding genders in the United Kingdom.
gbr_nicknames_infa
Dictionary
Contains a list of United Kingdom nicknames.
gbr_prename_gender_infa
Dictionary
Contains name prefixes and the gender associated with each prefix.
gbr_tel_area_codes_infa
Dictionary
Contains a list of United Kingdom telephone area codes.
title_case_excptn_infa
Dictionary
Contains a list of titles that precede personal names.
lbl_GBR_Area_Code_2Digits
Labeler
Labels two-digit United Kingdom telephone area codes.
lbl_GBR_Area_Code_3Digits
Labeler
Labels three-digit United Kingdom telephone area codes.
lbl_GBR_Area_Code_4Digits
Labeler
Labels four-digit United Kingdom telephone area codes.
lbl_GBR_NINO
Labeler
Labels United Kingdom National Insurance Numbers (NINO).
mplt_Assign_DQ_GeocodingStatus_Description
Mapplet
Assigns the text description of a Geocoding Accuracy Code value that a verifier asset generates.
mplt_Assign_DQ_Match_Code_Description
Mapplet
Assigns the text description of a Verification Status Code value that a verifier asset generates.
mplt_Email_Validation
Mapplet
Validates the format of email addresses. The mapplet does not verify that the email addresses are accurate or active. Returns Valid or Invalid.
mplt_GBR_Address_Validation_Discrete
Mapplet
Validates the deliverability of United Kingdom addresses. Use the mapplet when you can connect the input address fields to fields on the Discrete model in the verifier asset.
mplt_GBR_Address_Validation_Discrete_w_Geocoding
Mapplet
Validates the deliverability of United Kingdom addresses and adds latitude and longitude coordinates to each valid address. Use the mapplet when you can connect the input address fields to fields on the Discrete model in the verifier asset.
mplt_GBR_Address_Validation_Hybrid
Mapplet
Validates the deliverability of United Kingdom addresses. Use the mapplet when you can connect the input address fields to fields on the Hybrid model in the verifier asset.
mplt_GBR_Address_Validation_Hybrid_Geocoding
Mapplet
Validates the deliverability of United Kingdom addresses and adds latitude and longitude coordinates to each valid address. Use the mapplet when you can connect the input address fields to fields on the Hybrid model in the verifier asset.
mplt_GBR_Address_Validation_Multiline
Mapplet
Validates the deliverability of United Kingdom addresses. Use the mapplet when you can connect the input address fields to fields on the Multiline model in the verifier asset.
mplt_GBR_Address_Validation_Multiline_w_Geocoding
Mapplet
Validates the deliverability of United Kingdom addresses and adds latitude and longitude coordinates to each valid address. Use the mapplet when you can connect the input address fields to fields on the Multiline model in the verifier asset.
mplt_GBR_Bank_Account_Parse
Mapplet
Parses an eight-digit United Kingdom bank account number from a string.
mplt_GBR_Bank_Account_Validation
Mapplet
Validates United Kingdom bank account numbers. The mapplet checks if the input is numeric and if the value is eight characters long.
mplt_GBR_Bank_Sort_Code_Parse
Mapplet
Parses six-digit numeric strings as United Kingdom bank sort codes. The mapplet parses strings of numbers in the following formats:
  • Consecutive numbers (999999)
  • Numbers delimited with a dash (99-99-99)
mplt_GBR_Bank_Sort_Code_Standardize
Mapplet
Standardizes a United Kingdom bank sort code to the format NN-NN-NN.
mplt_GBR_Bank_Sort_Code_Validation
Mapplet
Validates the format and length of United Kingdom bank sort codes that are standardized to the dash-delimited format (99-99-99). The mapplet returns a Status field that describes the validity of the sort code and a Validation Note field that explains the status. If the sort code prefix matches a known assignment for a United Kingdom bank, the validation note includes the bank name.
mplt_GBR_Company_Name_Postcode_Match
Mapplet
Identifies duplicate records in United Kingdom data based on company names and postal codes. The mapplet generates group keys from the postal codes.
mplt_GBR_Company_Name_Standardization
Mapplet
Standardizes a company name and provides the acronym for the name if it is possible to do so.
mplt_GBR_Contact_Data
Mapplet
Parses, standardizes, and validates United Kingdom contact data, such as addresses, telephone numbers, and National Insurance Numbers (NINO).
mplt_GBR_Driver_Number_Parse
Mapplet
Parses values that match the format of United Kingdom driver's license numbers from a string.
mplt_GBR_Driver_Number_Validation
Mapplet
Validates United Kingdom driver's license numbers based on the requirements of the United Kingdom Government Data Standards Catalogue.
mplt_GBR_Familyname_NINO_Match
Mapplet
Identifies duplicate records in United Kingdom data based on surnames and National Insurance Numbers (NINO). The mapplet generates group keys from the NINO data.
mplt_GBR_Familyname_Postcode_Match
Mapplet
Identifies duplicate records in United Kingdom data based on surnames and United Kingdom postal codes. The mapplet generates group keys from the postal code data.
mplt_GBR_Firstname_3CharsSurname_DOB_Postcode_Match
Mapplet
Identifies duplicate records in United Kingdom data based on the following data:
  • First name
  • The first three characters in the surname
  • Date of birth
  • Postal code
The mapplet generates group keys from the postal code data.
mplt_GBR_Firstname_Surname_2ElementsDOB_Postcode_Match
Mapplet
Identifies duplicate records in United Kingdom data based on the following data:
  • Person names
  • Any two date of birth elements, such as month and year
  • United Kingdom postal code
The mapplet generates group keys from the postal code data.
mplt_GBR_Firstname_Surname_DOB_Postcode_Match
Mapplet
Identifies duplicate records based on the following data:
  • Person names
  • Date of birth
  • Postal code
The mapplet generates group keys from the postal code data.
mplt_GBR_Gender_Assignment
Mapplet
Assigns gender to first names. The mapplet returns M for male names, F for female names, and U if the gender is unknown. For example, the mapplet assigns the name John Smith a gender of M for male.
mplt_GBR_Given_Name_Standard
Mapplet
Generates given names from United Kingdom nicknames. For example, the mapplet standardizes the nickname Bob to the given name Robert.
mplt_GBR_IMO_Company_Name_Address_Match
Mapplet
Identifies duplicate records in United Kingdom data based on company names and addresses.
mplt_GBR_Individual_Name_and_Email_Match
Mapplet
Identifies duplicate records in United Kingdom data based on person names and email address data. The mapplet generates group keys from the email address data.
mplt_GBR_Individual_Name_and_NINO_Match
Mapplet
Identifies duplicate records in United Kingdom data based on person names and National Insurance Numbers (NINO). The mapplet generates group keys from the NINO data.
mplt_GBR_Individual_Name_and_Postcode_Match
Mapplet
Identifies duplicate records in United Kingdom data based on person names and postal code data. The mapplet generates group keys from the postal code data.
mplt_GBR_Multi_Person_Name_Parse
Mapplet
Parses person name values into separate fields. The mapplet creates fields for values such as title, first name, middle name, and surname.
mplt_GBR_NHS_Number_Parse
Mapplet
Parses National Health Service (NHS) numbers from a string.
mplt_GBR_NHS_Number_Standardize
Mapplet
Standardizes National Health Service (NHS) numbers into a standard format (999 999 9999). The mapplet requires that the input is a 10-digit string.
If the input value does not yield a 10-digit value, the mapplet returns the input value.
mplt_GBR_NHS_Number_Validate
Mapplet
Validates National Health Service (NHS) numbers based on the check digit for each number. The mapplet requires that the input is a 10-digit string.
mplt_GBR_NINO_Conformity_Check
Mapplet
Validates that the input value conforms to a standard pattern for United Kingdom National Insurance Numbers (NINO). The mapplet does not verify that a NINO is accurate or active.
mplt_GBR_NINO_Parse
Mapplet
Parses United Kingdom National Insurance Numbers (NINO) from strings. The mapplet returns the NINO and also returns a string that contains the input text with the NINO removed.
mplt_GBR_NINO_Standardization
Mapplet
Standardizes United Kingdom National Insurance Numbers (NINO) into the two most typical formats. The mapplet returns the following formats, where C represents alphabetic characters and N represents numerals:
  • CC NN NN NN C
  • CCNNNNNNC
The mapplet formats all alphabetic characters as uppercase.
mplt_GBR_NINO_Validation
Mapplet
Validates a United Kingdom National Insurance Number (NINO).
mplt_GBR_Passport_Number_MR_Parse
Mapplet
Parses United Kingdom passport numbers that appear in the machine-readable extended format.
mplt_GBR_Passport_Number_Parse
Mapplet
Parses nine-digit strings that conform to the United Kingdom passport number format that the Government Data Standards Catalogue specifies.
mplt_GBR_Passport_Number_Validation
Mapplet
Validates nine-digit strings that conform to the United Kingdom passport number format that the Government Data Standards Catalogue specifies.
mplt_GBR_Personal_Name_Parsing_FML
Mapplet
Parses person name values into separate fields.
The mapplet creates the fields in the following sequence:
  • First name, middle name, last name
mplt_GBR_Personal_Name_Parsing_LFM
Mapplet
Parses person name values into separate fields.
The mapplet creates the fields in the following sequence:
  • Last name, first name, middle name
mplt_GBR_Phone_Number_Parse
Mapplet
Parses a United Kingdom telephone number from a string.
The mapplet recognizes telephone numbers that use leading zeros, the +44 international dialing code, and extensions that begin with the hash symbol. The mapplet processes the following punctuation symbols: the plus sign, parentheses, and the hash symbol. Before you run the mapplet, remove all other punctuation and any double spaces.
mplt_GBR_Phone_Number_Standardization
Mapplet
Standardizes United Kingdom telephone numbers to international and local dialing formats. The mapplet recognizes telephone numbers that use leading zeros, the +44 international dialing code, and extensions that begin with the hash symbol.
mplt_GBR_Phone_Number_Validation
Mapplet
Validates the area code and length of United Kingdom telephone numbers. The mapplet can also return the region that the area code identifies.
mplt_GBR_Postcode_Parse
Mapplet
Parses United Kingdom postal codes from a string.
mplt_GBR_Postcode_Standardise
Mapplet
Standardizes United Kingdom postal codes. The mapplet requires that the input follows predefined formats.
The mapplet standardizes inputs that match the following patterns:
  • A9 9AA
  • A99 9AA
  • AA9 9AA
  • AA99 9AA
  • A9A 9AA
  • AA9A 9AA
  • GIR 0AA
If the input value does not match the above patterns, the mapplet returns the input value.
mplt_GBR_Postcode_Validate
Mapplet
Validates United Kingdom postal codes. If the mapplet does not find a postal code, it verifies whether the input follows a standard United Kingdom postal code pattern.
mplt_LowerCase
Mapplet
Converts input text to lowercase characters.
mplt_Prename_Assignment
Mapplet
Assigns a name prefix based on gender. Includes a variable that sets the female prefix, for example, Ms. or Mrs.
mplt_Remove_Extra_Spaces
Mapplet
Replaces multiple consecutive character spaces with a single space, and trims leading and trailing spaces.
mplt_Remove_Leading_Zero
Mapplet
Removes a leading zero from a number.
mplt_Remove_Period_Parantheses
Mapplet
Removes all occurrences of a period and left and right parentheses.
mplt_Remove_Punctuation
Mapplet
Removes all punctuation symbols.
mplt_Remove_Punctuation_and_Space
Mapplet
Removes all punctuation symbols and character spaces.
mplt_Remove_Space
Mapplet
Removes all occurrences of a character space.
mplt_Replace_Limited_Punct_with_Space
Mapplet
Replaces punctuation symbols, including a forward slash, backslash, exclamation point, period, or underscore character, with a character space. Also replaces instances of two, three, or four consecutive character spaces with a single space.
mplt_Salutation_Assignment
Mapplet
Generates formal and casual greetings from prefixes and names. For example, when input data contains "Mr. John Smith," the mapplet generates the formal greeting "Dear Mr. Smith," and the casual greeting "Dear John,". You can change the prefix and punctuation by editing the variables in the dq_Generate_Salutation Expression transformation.
mplt_UpperCase
Mapplet
Changes alphabetic characters to uppercase.
p_Assign_Firstname_Gender
Parse
Assigns a gender value based on the first name.
p_Assign_Midname_Gender
Parse
Assigns a gender value based on the middle name.
p_Assign_Prename_Gender
Parse
Assigns a gender value based on the name prefix.
p_GBR_driver_number
Parse
Parses a United Kingdom driver's license number from a string.
p_GBR_NINO
Parse
Parses a United Kingdom National Insurance Number (NINO) using a regular expression.
p_GBR_ParseFullName
Parse
Parses a full name from the input field.
p_GBR_Postcode_Validate_Format
Parse
Parses a string that matches a United Kingdom postcode format.
p_GBR_Postcode
Parse
Parses a United Kingdom postcode using a regular expression.
p_last_alphabet
Parse
Parses the last alphabetical character from a string.
p_NHS_Number
Parse
Parses a United Kingdom National Health Service (NHS) number using regular expressions.
p_Parse_Assign_GeocodingStatus_Desc
Parse
Parses geocoding status descriptions.
p_Parse_GBR_Acct_Number
Parse
Parses a United Kingdom account number.
p_Parse_GBR_Area_Code_2Digit
Parse
Parses two-digit telephone area codes for the United Kingdom.
p_Parse_GBR_Area_Code_3Digit
Parse
Parses three-digit telephone area codes for the United Kingdom.
p_Parse_GBR_Area_Code_4Digit
Parse
Parses four-digit telephone area codes for the United Kingdom.
p_Parse_GBR_Bank_Sort_Code
Parse
Parses United Kingdom bank sort codes.
p_Parse_GBR_Match_Bank_Sort_Codes
Parse
Parses a United Kingdom bank sort code from the input data when an input value matches a value in a dictionary.
p_Parse_GBR_Name_Components_FML
Parse
Parses name components for United Kingdom: F
p_Parse_GBR_Name_Components_LFM
Parse
Parses name components for United Kingdom: L
p_Parse_GBR_Passport_Number
Parse
Parses United Kingdom passport numbers.
p_Parse_GBR_Passport_Number_MR
Parse
Parses machine-readable United Kingdom passport numbers.
p_Parse_GBR_Phone
Parse
Parses United Kingdom telephone numbers.
p_UK_Postcode_Validate
Parse
Parses United Kingdom postcodes using a regular expression.
p_Word
Parse
Extracts the first alphabetical string from the input field where the alphabetical string is defined by a leading or trailing character space. For example, extracts Check from Check Test, testword from testword 123, or Check from 123 Check.
rs_Assign_DQ_Match_Code_Description
Rule Specification
Returns the text description of a Verification Status Code value that a verifier asset generates.
av_GBR_AddressValidation_Discrete
Verifier
Verifies the deliverability of United Kingdom addresses. Use the asset when you can connect the input address fields to fields on the Discrete model.
av_GBR_AddressValidation_Discrete_w_Geo
Verifier
Verifies the deliverability of United Kingdom addresses and adds latitude and longitude coordinates to each valid address. Use the asset when you can connect the input address fields to fields on the Discrete model.
av_GBR_AddressValidation_Hybrid
Verifier
Verifies the deliverability of United Kingdom addresses. Use the asset when you can connect the input address fields to fields on the Hybrid model.
av_GBR_Hybrid_w_Gecoding
Verifier
Verifies the deliverability of United Kingdom addresses and adds latitude and longitude coordinates to each valid address. Use the asset when you can connect the input address fields to fields on the Hybrid model.
av_GBR_AddressValidation_Multiline_Geo
Verifier
Verifies the deliverability of United Kingdom addresses and adds latitude and longitude coordinates to each valid address. Use the asset when you can connect the input address fields to fields on the Multiline model.
av_GBR_Parse_Multiline_Address
Verifier
Parses unstructured United Kingdom addresses into address elements. Use the asset when you can connect the input address fields to fields on the Multiline model.
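
To make the descriptions of c_replace_numbers_alpha_w_9X and c_remove_labels more concrete, the following Python sketch mimics the behavior that the table describes. It is an illustration only and is not the logic of the bundle assets. The function names to_9x_pattern and remove_labels are hypothetical, and leaving all other characters unchanged is an assumption.

```python
def to_9x_pattern(text: str) -> str:
    """Replace every digit with '9' and every letter with 'X'.

    Assumption: characters that are neither digits nor letters
    (spaces, punctuation) pass through unchanged.
    """
    out = []
    for ch in text:
        if ch.isdigit():
            out.append("9")
        elif ch.isalpha():
            out.append("X")
        else:
            out.append(ch)
    return "".join(out)


def remove_labels(text: str) -> str:
    """Remove the label characters 'X' and '9' from a string."""
    return text.replace("X", "").replace("9", "")


if __name__ == "__main__":
    print(to_9x_pattern("SW1A 1AA"))
    print(repr(remove_labels("XX9X 9XX")))
```

A pattern string of this kind can help you spot values that do not follow the expected shape before you run a parse or validation mapplet.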
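
The prefix clean-up that c_GBR_Remove_Intl_Dial_Code and c_Remove_Leading_Zero describe can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation of the assets: the function names are hypothetical, and trimming the spaces around the dialing code is an assumption.

```python
import re

# Variations of the United Kingdom dialing code named in the table: +44 and 0044.
# Assumption: surrounding spaces are trimmed together with the code.
_DIAL_CODE = re.compile(r"^\s*(?:\+44|0044)\s*")


def remove_gbr_dial_code(value: str) -> str:
    """Remove +44 or 0044 from the start of the input string."""
    return _DIAL_CODE.sub("", value, count=1)


def remove_leading_zero(value: str) -> str:
    """Remove the first character if that character is a zero."""
    return value[1:] if value.startswith("0") else value


if __name__ == "__main__":
    for raw in ("+44 20 7946 0958", "0044 20 7946 0958", "020 7946 0958"):
        print(remove_leading_zero(remove_gbr_dial_code(raw)))
```

Applying both steps in sequence brings the three sample inputs to the same value, which is the kind of normalization that makes later matching simpler.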
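
The two input formats that mplt_GBR_Bank_Sort_Code_Parse lists (999999 and 99-99-99) and the NN-NN-NN output of mplt_GBR_Bank_Sort_Code_Standardize can be illustrated with a short Python sketch. The function names and regular expressions are hypothetical, and returning non-matching input unchanged is an assumption. The mapplets may behave differently in edge cases.

```python
import re
from typing import Optional

# Six consecutive digits, or three digit pairs delimited with dashes.
# The lookarounds stop the pattern from matching inside a longer digit run.
_SORT_CODE_IN_TEXT = re.compile(
    r"(?<![0-9])(?:[0-9]{6}|[0-9]{2}-[0-9]{2}-[0-9]{2})(?![0-9])"
)
_SORT_CODE_ONLY = re.compile(r"[0-9]{6}|[0-9]{2}-[0-9]{2}-[0-9]{2}")


def parse_sort_code(text: str) -> Optional[str]:
    """Return the first substring shaped like a sort code, or None."""
    match = _SORT_CODE_IN_TEXT.search(text)
    return match.group(0) if match else None


def standardize_sort_code(value: str) -> str:
    """Format a sort code as NN-NN-NN.

    Assumption: input that is not in one of the two formats is returned as is.
    """
    candidate = value.strip()
    if not _SORT_CODE_ONLY.fullmatch(candidate):
        return value
    digits = candidate.replace("-", "")
    return "-".join((digits[0:2], digits[2:4], digits[4:6]))


if __name__ == "__main__":
    found = parse_sort_code("Sort code 404784, account 12345678")
    print(found, standardize_sort_code(found or ""))
```

The eight-digit account number in the sample text is not returned, because the pattern does not match six digits inside a longer run of digits.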
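
The formatting and check-digit steps that mplt_GBR_NHS_Number_Standardize and mplt_GBR_NHS_Number_Validate describe can be sketched in Python. The check below uses the publicly documented Modulus 11 scheme for NHS numbers. Whether the mapplets apply exactly these steps is an assumption, the function names are hypothetical, and stripping spaces and hyphens before the length check is also an assumption.

```python
import re


def _digits_only(value: str) -> str:
    """Assumption: spaces and hyphens are ignored before the 10-digit check."""
    return re.sub(r"[\s-]", "", value)


def standardize_nhs_number(value: str) -> str:
    """Format a 10-digit value as 999 999 9999; otherwise return the input."""
    digits = _digits_only(value)
    if not re.fullmatch(r"[0-9]{10}", digits):
        return value
    return f"{digits[:3]} {digits[3:6]} {digits[6:]}"


def is_valid_nhs_number(value: str) -> bool:
    """Check the tenth digit with the Modulus 11 algorithm."""
    digits = _digits_only(value)
    if not re.fullmatch(r"[0-9]{10}", digits):
        return False
    # Weight the first nine digits by 10, 9, ..., 2 and sum the products.
    total = sum(int(d) * w for d, w in zip(digits[:9], range(10, 1, -1)))
    check = 11 - (total % 11)
    if check == 11:
        check = 0
    # A computed check value of 10 means the number cannot be valid.
    return check != 10 and check == int(digits[9])


if __name__ == "__main__":
    sample = "9434765919"  # widely used test value, not a real patient number
    print(standardize_nhs_number(sample), is_valid_nhs_number(sample))
```

For the sample value, the weighted sum of the first nine digits is 299, the remainder after division by 11 is 2, and 11 minus 2 gives the check digit 9.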
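
The two output formats that mplt_GBR_NINO_Standardization returns (CC NN NN NN C and CCNNNNNNC, in uppercase) can be illustrated as follows. This sketch checks only the character shape given in the table. It does not apply the official rules on which letters are permitted, and it does not confirm that a NINO exists. The function name and the choice to return None for non-matching input are assumptions.

```python
import re
from typing import Optional, Tuple

# Two letters, six digits, one letter: the shape CCNNNNNNC from the table.
_NINO_SHAPE = re.compile(r"([A-Za-z]{2})([0-9]{2})([0-9]{2})([0-9]{2})([A-Za-z])")


def standardize_nino(value: str) -> Optional[Tuple[str, str]]:
    """Return (spaced, compact) uppercase forms, or None if the shape differs.

    Assumption: whitespace in the input is ignored before matching.
    """
    compact_input = re.sub(r"\s+", "", value)
    match = _NINO_SHAPE.fullmatch(compact_input)
    if match is None:
        return None
    prefix, first, second, third, suffix = (p.upper() for p in match.groups())
    spaced = f"{prefix} {first} {second} {third} {suffix}"
    compact = f"{prefix}{first}{second}{third}{suffix}"
    return spaced, compact


if __name__ == "__main__":
    print(standardize_nino("qq123456c"))  # QQ prefix is used for examples only
```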
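
The pattern list in the mplt_GBR_Postcode_Standardise description (A9 9AA through AA9A 9AA, plus GIR 0AA) maps naturally onto a regular expression. The Python sketch below is an illustration under stated assumptions: the function name is hypothetical, and the output convention (uppercase, with one space before the final three characters) and the tolerance for lowercase or unspaced input are assumptions, because the table does not state them.

```python
import re

# Outward part: A9, A99, AA9, AA99, A9A, or AA9A. Inward part: 9AA.
# The expression is applied after whitespace is removed and letters are uppercased.
_GENERAL_POSTCODE = re.compile(r"[A-Z]{1,2}[0-9][0-9A-Z]?[0-9][A-Z]{2}")


def standardize_postcode(value: str) -> str:
    """Return the postcode in 'OUTWARD INWARD' form, or the input if no pattern fits."""
    compact = re.sub(r"\s+", "", value).upper()
    if compact == "GIR0AA" or _GENERAL_POSTCODE.fullmatch(compact):
        # The inward part is always the final three characters.
        return f"{compact[:-3]} {compact[-3:]}"
    return value


if __name__ == "__main__":
    for raw in ("sw1a1aa", "M1 1AE", "gir0aa", "12345"):
        print(standardize_postcode(raw))
```

As in the mapplet description, a value that matches none of the listed patterns is returned unchanged. A pattern match does not show that the postcode exists.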
