This section defines the terms used in SSA-NAME3.
Accept-Limit is the score above which a candidate record is considered an accepted match, in which case the Match Decision returned is "A". It works together with Reject-Limit: records attaining a score between the Accept and Reject limits receive a Match Decision of "U" (Undecided). It is pre-defined in a Population rule-set, and can be overridden by the search application.
The set of records returned from a name search. Each candidate should be compared with the original search record using the ssan3_match function; the accepted records, and optionally the suspect records, are then displayed or otherwise further processed.
A name field which explicitly refers to more than one simple name, for example:
INFORMATICA CORPORATION dba IDENTITY SYSTEMS
JOHN SMITH and GEORGE BROWN
The keywords, values and parameters that govern how an API Function will operate and which are passed through the API. The format and contents of the controls will depend on the type of function being used. More information can be found in the Controls section of the
Standard Population rules modified by Informatica for a user with special search and matching requirements and re-packaged as a Custom Population. These may also have been built from scratch for a totally new type of population.
Special characters contained within the data which separate distinct fields or keywords.
A Java-based GUI used by the developer for understanding and testing the various API Functions, accessing online documentation, and generating sample programs.
Edit Rule Wizard
A Java GUI tool that helps a business user safely add certain types of Edit Rules to the Standard or Custom Population without requiring specific knowledge of SSA-NAME3 or support from a programmer or data analyst. The types of rules that can be added using this tool are:
Discard a word or phrase when searching and matching (e.g. a new "noise" word)
Add a new replacement word or phrase when searching and matching (e.g. a new "abbreviation" or "acronym")
Add a new compound name marker word
A field that is used for Matching, and not used for Key or Range building, that supports Edit-list rules. Examples of this are:
Efields benefit from Edit-lists in cases such as an ID number that consists entirely of 9s and should therefore be considered a "null" number. A rule is required to treat all 9s as a noise word.
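The all-9s case can be illustrated with a minimal Python sketch. It only mimics the effect of such a rule in application code; it is not how SSA-NAME3 Edit-lists are defined or applied, and the function name `normalize_id` is hypothetical.

```python
from typing import Optional


def normalize_id(value: str) -> Optional[str]:
    """Return None for a placeholder ID made up only of 9s, else the trimmed ID.

    Hypothetical helper: it imitates the effect of an Edit-list rule that
    treats an all-9s ID number as a "null" value.
    """
    trimmed = value.strip()
    # A non-empty string whose only distinct character is "9" is a placeholder.
    if trimmed and set(trimmed) == {"9"}:
        return None
    return trimmed


print(normalize_id("999999999"))  # None
print(normalize_id("123456789"))  # 123456789
```

With a rule like this in place, two records that both carry the placeholder value are not treated as sharing the same ID number.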
A description of any error which may have occurred during an API Function call.
For high-risk and critical search applications, this is the Key Level to be used when generating SSA-NAME3 Keys. In contrast to Standard Keys, Extended Keys include keys built from additional token concatenation.
The data retrieved from the file as a result of finding a set of candidate records using the key ranges returned from the
function. The File Data is compared with the Search Data by the
function to calculate a Score and Match decision.
The process of discarding candidates that fail to meet a certain Score threshold or are deemed "Rejected". This reduces the number of records that need to be passed back over the network, shown to the user or further processed by a program.
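As a rough sketch of filtering in application code, the Python below drops rejected candidates and, optionally, those under a minimum score. The `Candidate` type, its field names, and the `filter_candidates` function are assumptions made for illustration; they are not part of the SSA-NAME3 API.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """Hypothetical container for one matched candidate."""
    record_id: str
    score: int      # 0-100 Score from matching
    decision: str   # "A", "U" or "R"


def filter_candidates(candidates: List[Candidate],
                      min_score: int = 0,
                      keep_undecided: bool = True) -> List[Candidate]:
    """Discard rejected candidates and any that score below min_score."""
    kept = []
    for cand in candidates:
        if cand.decision == "R":
            continue  # rejected matches are never passed on
        if cand.decision == "U" and not keep_undecided:
            continue  # caller asked for accepted matches only
        if cand.score < min_score:
            continue  # below the application's own threshold
        kept.append(cand)
    return kept


matched = [
    Candidate("rec-1", 96, "A"),
    Candidate("rec-2", 81, "U"),
    Candidate("rec-3", 40, "R"),
]
print([c.record_id for c in filter_candidates(matched)])
print([c.record_id for c in filter_candidates(matched, keep_undecided=False)])
```

Filtering as early as possible, ideally before results cross the network, is what yields the saving described above.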
An SSA-NAME3 API Function which is called from the application to perform a distinct task. For example,
will generate SSA-NAME3 Keys;
will generate a Key Ranges Array;
will match a pair of records and return a Score and Match Decision. These functions and more are defined in detail in the
In the context of SSA-NAME3, Fuzzy Keys is a term that refers to the special SSA-NAME3 Keys built from names or addresses that have been treated by a variety of techniques to overcome the error and variation in the data.
A single character word or the first character of a word.
The field used to build SSA-NAME3 Keys using the
function. In SSA-NAME3 Standard Populations, the supported Key Fields are Person_Name, Organization_Name and Address_Part1.
Key Field Data
The value(s) of the field used to build SSA-NAME3 Keys using the
function call. The keys generated from the call are stored by the user’s application in a user-defined "SSA-NAME3 Key Table" within the database. The SSA-NAME3 Key Table is designed by the user’s DBA. The SSA-NAME3 Keys column must be indexed.
When using the
function from a search application, the Key Field Data will contain the value(s) of the field used to initiate the search.
Key Field Data may consist of one value or several repeating values. Examples of repeating values are: a person’s name and their maiden name; a residential address and a mailing address.
The process whereby a user application calls the
function to generate SSA-NAME3 Keys from the Key Field Data (typically a name or an address). The application will then store these keys in a database table referred to as the SSA-NAME3 Key Table.
The number of keys returned from the
function call. This value is used by the application code to ensure that all of the keys returned are stored in the SSA-NAME3 Key Table.
Refers to the number and variety of keys to be generated by an
call. The three Key Levels are Standard keys, Extended keys and Limited keys.
If disk space is limited, SSA-NAME3 can generate "Limited" SSA-NAME3 Keys. Limited keys are a subset of Standard keys. However, the designer/developer should be aware that the use of Limited keys, while saving on disk space, may also reduce search reliability.
A Standard Population or Custom Population that has been modified locally via either the Population Override Manager or Edit Rule Wizard.
The word in a Name identified as being the most significant word. In some Search Strategies, it is used as the primary part of the Search key ranges, and for extra weighting in some Match Purposes (e.g. family name in a Household Purpose).
The process whereby a user application calls the
function to compare two records, usually a Search and a File record, and compute a Score and Match Decision.
The ultimate business purpose of the search/match application. This will be provided as a parameter to the
function. Examples of Match Purposes are "same name", "same individual", "same resident", "same household", "same organization", "same division", "same corporate entity", "same contact", "same address".
A 1-byte character value which identifies the judgment on the matched records. Values are "A" for Accept, "U" for Undecided and "R" for Reject. The thresholds by which these decisions are chosen can be varied by the user.
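The relationship between Score, the Accept and Reject limits, and the Match Decision can be sketched in Python as follows. This is an illustration only: the function `match_decision` is hypothetical, and the treatment of scores that fall exactly on a limit (a score equal to the Accept limit is accepted, a score equal to the Reject limit is undecided) is an assumption, not something stated in this glossary.

```python
def match_decision(score: int, accept_limit: int, reject_limit: int) -> str:
    """Map a 0-100 score to "A" (Accept), "U" (Undecided) or "R" (Reject).

    Boundary handling is an assumption made for this sketch.
    """
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if reject_limit > accept_limit:
        raise ValueError("reject_limit must not exceed accept_limit")
    if score >= accept_limit:
        return "A"
    if score < reject_limit:
        return "R"
    return "U"


# Example limits chosen arbitrarily for illustration.
for s in (95, 80, 50):
    print(s, match_decision(s, accept_limit=90, reject_limit=70))
```

Raising the Accept limit or lowering the Reject limit widens the Undecided band, which sends more records to manual review.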
Used in defining the level of Matching to be performed for a particular search application. In most Standard Populations, possible values are Conservative, Typical and Loose. The three possible values allow adjustment to the "tightness" of the match.
Any word in a Name which is not the Major word.
The name of a person, company, business or organization; an address; a product title, song title or book title; any short description. A name consists of a number of words and optionally codes, each with a limit of 24 characters.
An internal setting that specifies at which end of a name or address (Left or Right) the Major Word can be found. This can be overridden by an application program in certain API calls.
Words that do not contribute to, and can impede, a search or match function. Such words are removed when SSA-NAME3 processes a name through an
call. Examples of Noise Words are Personal Titles (e.g. Mr., Mrs.), Street Types (e.g. Rd.) and Company legal endings (e.g. Inc., Ltd.). As with other Edit rules, Noise Words are population specific and vary according to which Standard Population is being used.
This typically refers to the country and language of the data to be used in the search system; however, populations can be both super-sets and sub-sets of country and language populations. An example of a super-set population is the combination of all Western European populations into one population for searching and matching. An example of a sub-set population is the USA’s OFAC list of Specially Designated Nationals.
Population Override Manager
A Java GUI tool that helps a trained data analyst override some of the Standard Population rules that are supplied with the product, or provided in the form of a Custom Population. The types of rules that can be overridden using this tool are:
Scalar Frequency Tables
Use of this tool without proper training from Informatica is not recommended, as improper use can adversely affect the reliability and performance of the search application(s).
Returned from a call to the
function. The Ranges Array is a set of "Start" and "End" SSA-NAME3 Key values. These should be used by the user’s search application to form a set of SQL select statements that retrieve records within those ranges.
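The following Python sketch shows one way a search application might turn a Ranges Array into SQL select statements. Everything specific here is an assumption for illustration: the table name `name_keys`, the column names `ssa_key` and `record_id`, the use of SQLite, the inclusive treatment of the "Start" and "End" values, and the key values themselves, which are made-up placeholders rather than real SSA-NAME3 Keys.

```python
import sqlite3
from typing import List, Tuple


def fetch_candidate_ids(conn: sqlite3.Connection,
                        ranges: List[Tuple[str, str]]) -> List[int]:
    """Run one range query per (start, end) pair and return distinct record ids."""
    seen = set()
    ordered = []
    for start_key, end_key in ranges:
        rows = conn.execute(
            "SELECT record_id FROM name_keys "
            "WHERE ssa_key BETWEEN ? AND ? ORDER BY ssa_key",
            (start_key, end_key),
        )
        for (record_id,) in rows:
            # A record may have several keys, so it can appear in several ranges.
            if record_id not in seen:
                seen.add(record_id)
                ordered.append(record_id)
    return ordered


# Hypothetical key table with an index on the key column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE name_keys (ssa_key TEXT NOT NULL, record_id INTEGER NOT NULL)")
conn.execute("CREATE INDEX name_keys_idx ON name_keys (ssa_key)")
conn.executemany(
    "INSERT INTO name_keys VALUES (?, ?)",
    [("AAAA0001", 1), ("AAAB0002", 2), ("BBBB0001", 1), ("CCCC0009", 3)],
)

# Placeholder Ranges Array: two (start, end) pairs.
ids = fetch_candidate_ids(conn, [("AAAA0000", "AAAZ9999"), ("BBBB0000", "BBBB9999")])
print(ids)  # [1, 2]
```

Because a single record can own several keys, the same record may fall inside more than one range; de-duplicating before matching avoids comparing it twice.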
The number of ranges in the Ranges Array. This is the number of ranges which the calling program must process.
The process of sorting the matched candidates by Score, highest first, so that the records are displayed to the user in descending order of their likeness to the search identity.
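A minimal Python sketch of this step is shown below, assuming each candidate is a hypothetical `(record_id, score)` pair produced by the application after matching.

```python
from typing import List, Tuple


def rank_candidates(candidates: List[Tuple[str, int]]) -> List[Tuple[str, int]]:
    """Return candidates sorted by score, highest first.

    Python's sort is stable, so candidates with equal scores keep
    their original relative order.
    """
    return sorted(candidates, key=lambda cand: cand[1], reverse=True)


ranked = rank_candidates([("rec-1", 72), ("rec-2", 95), ("rec-3", 88)])
print([record_id for record_id, _ in ranked])  # ['rec-2', 'rec-3', 'rec-1']
```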
Reject-Limit is the score below which a candidate record is considered a rejected match, in which case the Match Decision returned is "R". It works together with Accept-Limit: records attaining a score between the Accept and Reject limits receive a Match Decision of "U" (Undecided). It is pre-defined in a Population rule-set, and can be overridden by the search application.
A measure of the likelihood that a Search Strategy will find a name in the database if one exists that should be considered a match to the search name.
Refers to the Standard, Extended or Limited SSA-NAME3 Keys computed when a name or address is processed by the
function. They are referred to as "required" because all of the keys must be stored in the database table.
The default SSA-NAME3 Keys are 8 bytes in length and consist of printable characters. An option exists to generate 5-byte binary keys if your database supports such keys. The application program will store these key values in a separate table within the database specifically designed and optimized by your DBA for searching and matching. This table will be sorted and indexed on the column storing the SSA-NAME3 Keys.
Gives an indication of the validity of a call to SSA-NAME3. A Response Code value of zero indicates a successful call. If the Response Code is not zero, then a description of the problem will be reported in the Error Message parameter.
Scatter / Gather Data Format
This is a method of formatting the input data when using
A numeric value between 000 and 100 returned from the
call. It indicates how close a match was achieved after comparing the Search Data and File Data. The actual Score returned will depend on the Match Purpose, Match Level and the Search and File Data being compared.
A numeric value between 000 and 100 that defines the threshold for the Match Decision for a specific Match Purpose and Match Level for a given Population rule-set. Score Limits are pre-defined in the Population rule-sets, and can be overridden by the calling program.
The transaction data which contains the search information. It will contain the field value used to drive the search (that is, used in the
call) as well as all of the available data to be compared with the File Data during the
The method by which a search application receives search data from an input screen, processes the Ranges Array generated from the search data, and displays the ranked records back to the user.
Used in defining the type of Search Strategy to use for a particular search application. In most Standard Populations, possible values are Narrow, Typical, Exhaustive and Extreme. The four possible values allow adjustment to the "thoroughness" of the search. The wider the search, the more candidates are typically returned, which may increase the reliability of the search but also use more resources and take longer.
The combination of Key Field and Search Level passed to the
function to generate the Ranges Array.
The percentage of the database (that is, number of candidates / total number of database rows) that is retrieved to satisfy a particular search.
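The calculation can be written out as a short Python sketch; the function name `search_selectivity` is hypothetical and the figures are invented for illustration.

```python
def search_selectivity(candidate_count: int, total_rows: int) -> float:
    """Percentage of the database retrieved to satisfy one search."""
    if total_rows <= 0:
        raise ValueError("total_rows must be positive")
    if candidate_count < 0:
        raise ValueError("candidate_count must not be negative")
    return 100.0 * candidate_count / total_rows


# 250 candidates retrieved from a 1,000,000-row table.
print(search_selectivity(250, 1_000_000))  # 0.025
```

A lower percentage means fewer candidates to retrieve and match for each search.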
SSA-NAME3’s intelligent keys are computed when a name or address is processed by the
function. SSA-NAME3 Keys can be of 3 types: Standard Keys, Extended Keys or Limited Keys. The Keys are 8 bytes in length and consist of printable characters. An option exists to generate 5-byte binary keys. The application program will store these key values in a new table within the database or in a new indexed file specifically designed and optimized for searching and matching. This table will be sorted and indexed on the column storing the SSA-NAME3 Keys.
For typical applications, this is the Key Level to be used when generating SSA-NAME3 Keys. Standard Keys overcome more variation than Limited Keys while using less disk space than Extended Keys. High-risk and critical applications, however, should use Extended keys.
Standard Population (SP)
Standard algorithms which support various searching and matching rules and requirements, typically for a specific language and country. Note: all Standard Populations are delivered with the product; however, a separate license is required to use the double-byte character sets covered by the
Describes the use for the Name Search, for example, your project name. The System name is used to define the name of a folder or sub-directory where the Standard or Custom Population for this system should be stored and secured.
Tagged Data Format
This is a method of formatting the input data when using
In Tagged Format, the offsets and lengths of the data fields being passed do not need to be specified. Instead, a notation of labels and delimiters is used to break up the fields. By default the delimiter is an asterisk but it can be user defined.
The word or code components of a Name or Address.
The combination of hardware and operating system that will host the application that calls SSA-NAME3 and accesses the database.