Introduction to SSA-NAME3 (EXTN) Service Groups


Appendix A: Glossary

This section provides a glossary of terms.

Account Name: A name field, often referring to account details, which implicitly refers to more than one simple name, for example, JOHN AND MARY SMITH. See also Compound Name.
Algorithm: A combination of SSA-NAME3 routines that have been generated for a specific Population (for example, person names, company names, street names). The Algorithm is accessed through Calls to Services which are linked to it.
Alternate Keys: The Name-key(s) built from different word orders in a name. These can be either Positive or Negative Keys.
Authorization: The process of collecting the signatures and details of the routines linked to an Algorithm. These signatures are checked at run-time to ensure that no ’rogue’ modules have been linked.
Authorized Algorithm: The Algorithm for a Population that is currently available for use by application programs.
Bad key: SSA-NAME3 never returns an unusable Name-key. If for some reason a key could not be generated, a special "bad key" is returned. This bad key has a value of 800000HEX when using 5-byte binary keys, and K$$$$$$$ when using 8-byte character keys. It can be used to group bad names so that they can be found. An example would be the name THE LIMITED. If both words in this name are removed by the Edit-list, there is no name left to build the key from, and the bad key is returned.
Candidates: The set of records returned from a Name search. For optimum quality these candidates should be passed to the Matching Service for further qualification before being displayed or otherwise used in a search process.
Cascade: The name given to the Search-table structure built for the most common type of Positive Search strategy. The Search-table starts with the narrowest Name-key range, which could contain the ’searched-for’ record, and continues with progressively widening ranges.
Cleaning: The process of applying character set conversion rules to a Name with the intention of cleaning and/or converting unwanted characters.
Code-character: Any character marked as a code in character-set table 2. This is normally only the digits 0 - 9.
Code-word: Any token with one of the following attributes:
2 or more code characters,
an initial that is a code character,
1 or 2 characters in length and either preceded or followed by a Code-word.
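The three Code-word attributes can be sketched as a small classifier. This is only an illustration of the definition above, not SSA-NAME3 code. It assumes that the Code-characters are the digits 0 - 9, that "an initial that is a code character" means a single-character token, and that the adjacency rule is applied repeatedly until no further tokens qualify. The function names are hypothetical.

```python
# Assumption: only the digits are marked as Code-characters.
CODE_CHARACTERS = set("0123456789")


def count_code_chars(token):
    """Count the Code-characters in a token."""
    return sum(ch in CODE_CHARACTERS for ch in token)


def find_code_words(tokens):
    """Return one boolean per token, True where the token is a Code-word."""
    flags = []
    for tok in tokens:
        two_or_more = count_code_chars(tok) >= 2
        # Assumption: "initial" is read here as a single-character token.
        code_initial = len(tok) == 1 and tok in CODE_CHARACTERS
        flags.append(two_or_more or code_initial)

    # Third attribute: a token of 1 or 2 characters next to a Code-word.
    # Assumption: the rule is repeated until the result stops changing.
    changed = True
    while changed:
        changed = False
        for i, tok in enumerate(tokens):
            if flags[i] or not 1 <= len(tok) <= 2:
                continue
            before = i > 0 and flags[i - 1]
            after = i + 1 < len(tokens) and flags[i + 1]
            if before or after:
                flags[i] = True
                changed = True
    return flags


print(find_code_words(["UNIT", "12", "A", "HIGH", "ST"]))
```

In this sample, 12 qualifies because it has two Code-characters, and A qualifies because it is a short token that follows a Code-word. ST is short but has no Code-word neighbor, so it does not qualify.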
Common Name: A Name that occurred often enough in the sample user data provided to Frequency Table generation to be considered common.
Compound Name: A Name field which explicitly refers to more than one simple name, for example, JOHN SMITH AND GEORGE BROWN. See also Account Name.
Delimiter: Any character defined as a delimiter in character-set table 4.
Edit-list: A table of user-controlled words and phrases that undergo special processing in Name-key building and Matching, for example, noise words, personal titles, prefixes and suffixes, nicknames, common abbreviations, phrase replacements and Compound Name markers.
Edit-rule: A line in an Edit-list. For example, the line
RR ROB >ROBERT <
is an Edit-rule that says "Replace ROB with ROBERT".
Fast-start: A collection of sample definition modules for a specific country used to create a first-cut Service Group. The Fast-start definition files are used when first installing SSA-NAME3 as a quick way to test the Installation process and environment. They are later used as the basis for further customization work to make SSA-NAME3 achieve the objectives of the application and end-users.
Filtering: The process of Matching Candidates to reduce the number of records shown to the user.
Formatting: The process of applying Edit-rules to words, phrases and sub-strings.
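The effect of a replacement Edit-rule such as "Replace ROB with ROBERT" can be sketched as a whole-word substitution over the tokens of a name. The sketch below does not parse Edit-list syntax and is not the SSA-NAME3 Formatting routine. The REPLACEMENTS table and the function name are hypothetical, and splitting on whitespace is a simplification.

```python
# Hypothetical replacement table standing in for replacement Edit-rules.
REPLACEMENTS = {"ROB": "ROBERT"}


def apply_replacements(name, replacements=REPLACEMENTS):
    """Replace whole words only; words merely containing a key are untouched."""
    tokens = name.upper().split()
    return " ".join(replacements.get(tok, tok) for tok in tokens)


print(apply_replacements("Rob Smith"))
print(apply_replacements("Robson Smith"))
```

Matching on whole tokens rather than substrings keeps a name such as ROBSON from being rewritten.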
Frequency Table: A table generated from an organization’s Name data holding the most frequently used words that have not been deleted or skipped as a result of Edit-list processing.
Generation: The process of creating compilable source code modules from Definition files, either on a Windows computer or an MVS system.
Initial: A single character word or the first character of a word.
I+n: Nomenclature for "Initial plus n consonants".
Key Generation: The process of building one or more Name-keys from the Cleaned, Formatted and Stabilized Tokens of a Name.
Major-word: The word in a Name identified as being the most significant. It is used as the primary part of the Preferred Key, as the primary part of the Search key ranges of a positive search, and for weighting in Matching Schemes. See also Minor-word.
Major word-key: A 3-byte or 6-byte key generated from the Major-word of a Name.
Matching: The process of determining the probability that a search identity and a File entry are the same identity.
Matching Method: The way in which Matching compares two data items of the same data type. There are methods for names, addresses, dates, strings and codes. See also Method.
Matching Scheme: A definition of the structure of the data items to be matched and the Matching Methods and options to be used.
Method: A routine used for Matching two data items. There are Matching Methods for names, dates, codes and strings. For example, in the following Matching Definition line,
DEFINE METHOD=METHOD1,EP=N3SCL,ALGORITHM=PERSON
the method is N3SCL (the name matching method).
Minor-word: Any token in a Name which is not the Major-word and is not a word deleted by an Edit-rule.
Name: The name of a person, company, business or organization; an address; a product title, song title or book title; any short description. A name consists of a number of words, each with a limit of 24 characters.
Name-key: A compressed five-byte binary or eight-byte character key built from a Name using the NAMESET Service.
Negative Keys: The Name-key(s) built using each non-delete Token (word) in a Name in combination with every other word in the name. See also Positive Keys.
Negative Search Strategy: A method of building a Search-table for a search application whose normal requirement is to prove that a name does not already exist on a database.
Population: Population refers to a class or group of names that requires its own SSA-NAME3 Algorithm. Typical examples of "populations" are: customers, street lines from addresses, song titles, file titles etc.
Positive Keys: The Name-key(s) built using each non-delete Token (word) in a Name followed by the other words in a set order. See also Negative Keys.
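The difference between the word orders behind Positive Keys and Negative Keys can be illustrated with plain token lists. This is one possible reading of the two definitions, not the SSA-NAME3 key-building logic, and no actual Name-keys are produced. It assumes that the "set order" is the original word order, and that "in combination with every other word" means ordered pairs of words. Both function names are hypothetical, and the input is taken to be the non-delete Tokens only.

```python
def positive_orders(tokens):
    """Each token leads once, followed by the remaining tokens.

    Assumption: the 'set order' is the original order of the other words.
    """
    return [[lead] + tokens[:i] + tokens[i + 1:] for i, lead in enumerate(tokens)]


def negative_orders(tokens):
    """Each token combined with every other token.

    Assumption: combinations are ordered pairs of two different tokens.
    """
    return [
        [a, b]
        for i, a in enumerate(tokens)
        for j, b in enumerate(tokens)
        if i != j
    ]


name = ["JOHN", "PAUL", "SMITH"]
for order in positive_orders(name):
    print("positive:", " ".join(order))
for order in negative_orders(name):
    print("negative:", " ".join(order))
```

Under these assumptions a three-word name gives three positive word orders and six negative ones, which is consistent with a Negative Search Strategy casting a wider net than a Positive one.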
Positive Search Strategy: A method of building a Search-table for a search application whose normal requirement is to find a name that already exists on a name database.
Preferred Key: The Name-key built from the Major-word followed by the Minor-words in a Name.
Probe: A very narrow search range.
Ranking: The process of sorting the Matched Candidates to show the records to the user in descending order of their likeness to the search identity.
Reliability: The probability that a Search Strategy will find a name if one exists that should be considered as a match to the search Name.
Response Code: A unique number indicating the success or otherwise of the Service just called.
Scaler Frequency Table: A table generated from an organization’s Name data or an organization’s SSA-NAME3 keys holding the most frequently occurring Name-keys. Generation of this table is optional, and it is used to enhance the scale value returned in the NAMESET search ranges. See also Search Scale.
Score: A value between 1 and 100 returned by the Matching Service to an application. This defines the level of confidence that two candidate records match.
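Filtering, Ranking and Score fit together in a simple way: Candidates that have been given a Score are cut down, then sorted for display. The sketch below assumes each Candidate is already paired with a Score between 1 and 100. The threshold parameter and the function name are hypothetical, and the source does not say how Filtering chooses which records to drop.

```python
def filter_and_rank(scored_candidates, threshold):
    """Keep candidates scoring at least `threshold`, best match first.

    scored_candidates: list of (record, score) pairs, score in 1..100.
    threshold: hypothetical cut-off chosen by the application.
    """
    kept = [pair for pair in scored_candidates if pair[1] >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


scored = [("A", 62), ("B", 95), ("C", 40), ("D", 81)]
print(filter_and_rank(scored, threshold=60))
```

Because Python's sort is stable, Candidates with equal Scores stay in the order in which they were returned by the search.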
Search Contents: A term used to describe the number of Tokens (words and Initials) used in a particular search range.
Search Depth: A term used to describe the width of a Name-key search range or its Selectivity.
Search Dialogue: The method by which a search application processes a Search-table and displays the Candidates to the user.
Search Scale: An estimate of the number of Candidates that would be returned using a particular search range.
Search Strategy: The method by which a Search-table is built to achieve the optimum search results for the particular application requirement (e.g. Positive Search Strategy or Negative Search Strategy).
Search-table: A table of Name-key ranges used by a search application to access a Name database on a Name-key index. This is the physical implementation of a Search Strategy.
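A Cascade-style Search-table can be pictured as a list of key ranges ordered from narrowest to widest, which a search application tries in turn. The sketch below uses a sorted Python list as a stand-in for the Name-key index. The key values are made up for illustration and are not real Name-keys, and the stopping rule (stop once a range yields at least `wanted` hits) is an assumption about the Search Dialogue, not documented behavior.

```python
from bisect import bisect_left, bisect_right


def cascade_search(sorted_keys, search_table, wanted=1):
    """Try each (low, high) range in order; stop at the first that is productive.

    sorted_keys: sorted list standing in for the Name-key index.
    search_table: ranges ordered from narrowest to widest.
    wanted: hypothetical minimum number of hits before stopping.
    """
    for low, high in search_table:
        start = bisect_left(sorted_keys, low)
        end = bisect_right(sorted_keys, high)
        hits = sorted_keys[start:end]
        if len(hits) >= wanted:
            return (low, high), hits
    return None, []


# Made-up key values, for illustration only.
index = sorted(["KA100", "KA150", "KA155", "KA900", "KB200"])
table = [("KA155", "KA155"), ("KA150", "KA199"), ("KA000", "KA999")]
print(cascade_search(index, table, wanted=2))
```

Here the first range is a Probe that finds a single key, so the search widens to the second range, which returns two keys and ends the search.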
Selectivity: The percentage of records that are accessed to satisfy the average search.
Service: An SSA-NAME3 function that has been defined and generated for some specific user-required purpose and for a specific Population, e.g. building Name-keys and Search-tables, or Matching two records according to a specific set of rules.
Service Group: A collection of SSA-NAME3 Services that are grouped as one program under one name. In Call statements you call a Service Group name requesting a Service, passing parameters according to the service rules.
Service Group Data File: An ASCII text file containing the Service Group "ruleset". It is invoked at runtime by the shared object or dll code.
Service Name: The name used when referring to a Service, e.g. NAMESETP is a name typically used to define a service of type NAMESET for the PERSON Algorithm.
Service Type: The type of a Service. This defines its functionality, whereas the Service Name is simply a handle used to refer to the Service, e.g. the Service NAMESETP is of type NAMESET.
Skip-word: Any Token in a Name which is defined by an Edit-rule not to take part in Name-key building.
SSA-NAME3: The latest version of the SSA-NAME3 Algorithms.
Stabilization (Word Stabilization): The process of applying phonetic and orthographic transformation rules to Name Tokens to stabilize the error and variation.
Suspect Code-word: A word of 3 or more characters that includes 1 Code-character. See also Code-word.
Token: The individual word components of a Name after Cleaning are called Tokens.
Target Platform: The combination of hardware and software that will execute an SSA-NAME3 Algorithm.
Test-bed: An SSA-NAME3 utility which enables quick and reliable testing of Algorithms and Matching Schemes, either interactively or in batch mode. For Microsoft Windows, a Windows based Test-bed can also be used.
Vowel: Any character defined as a vowel in character-set table 4.
Word-key: A 3-byte or 6-byte key generated from any single Token (word).
Word-type: A one-character code assigned to a Token by the Formatting routine. Possible values are:
B - a Suspect Code-word.
C - a Code-word.
D - a deleted word (used only by the TRACE Service).
I - an Initial.
M - the Major-word.
N - the Major-word if it is a Code-word.
S - a Skip-word.
T - a Skip-word if it is a Code-word.
Y - any other word.
