The following paragraphs give an overview of the various SSA-NAME3 Services: the function each performs and, in some cases, its basic operation.
Key and Search Strategy Building
The Service used to perform Key and Search Strategy building is called NAMESET.
When a name or address is passed to NAMESET, the Algorithm to use is also identified by the Calling program. This Algorithm will cause the passed name or address to be processed in four phases:
Cleaning
This routine uses internal routines and the Algorithm’s Character-set tables to edit the passed string. Common actions include removing special characters and replacing lower-case with upper-case. The Cleaning routine itself is not customizable.
Formatting
This routine uses internal routines and the Algorithm’s Edit-list to edit the passed string into separate words, removing noise, delete or stop words, replacing selected words, concatenating prefix words and other such actions. The Formatting routine itself is not customizable; however, an exit is supplied which may be customized. A working example of this exit is supplied for English-style nickname processing.
Word Stabilization
This routine stabilizes the Cleaned, Formatted words using country-specific rules to cater for phonetic and orthographic error. For example, collapsing doubled characters into one is a common rule. The Stabilization routine is not generally available for customization.
Key and Search Strategy Generation
Builds the keys to be stored in the database, and the key ranges to be used at search time. The type, style and number of keys are controlled by user-customizable Algorithm options. The type of Search Strategy is controlled either by Algorithm options or by the Calling program.
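The four phases above can be pictured with a minimal Python sketch. It is not the NAMESET implementation or its API: every function name, the word tables, the single stabilization rule and the five-character key are assumptions chosen only to show how a string flows from Cleaning through to key and key-range generation.

```python
import re

# Hypothetical tables for illustration; a real Algorithm supplies its own
# Character-set tables and Edit-list.
NOISE_WORDS = {"MR", "MRS", "DR"}
REPLACEMENTS = {"BILL": "WILLIAM", "BOB": "ROBERT"}
PREFIX_WORDS = {"MC", "VAN", "DE"}


def clean(text):
    """Phase 1 (Cleaning): upper-case, then turn special characters into spaces."""
    return re.sub(r"[^A-Z0-9 ]", " ", text.upper())


def format_words(text):
    """Phase 2 (Formatting): split into words, drop noise words,
    replace selected words and concatenate prefix words."""
    words = []
    prefix = ""
    for word in text.split():
        if word in NOISE_WORDS:
            continue
        word = REPLACEMENTS.get(word, word)
        if word in PREFIX_WORDS:
            prefix += word          # hold the prefix for the next word
            continue
        words.append(prefix + word)
        prefix = ""
    if prefix:                      # a trailing prefix stands as its own word
        words.append(prefix)
    return words


def stabilize(word):
    """Phase 3 (Word Stabilization): one toy rule, collapse repeated characters."""
    return re.sub(r"(.)\1+", r"\1", word)


def build_keys(words):
    """Phase 4 (Key Generation): one toy key per word, truncated to 5 characters."""
    return sorted({stabilize(word)[:5] for word in words})


def search_range(key):
    """Phase 4 (Search Strategy): a toy key range covering every key with this prefix."""
    return (key, key + "\uffff")


def nameset_sketch(text):
    """Run the four phases in order and return the keys to store."""
    return build_keys(format_words(clean(text)))


print(nameset_sketch("Mr. Bill Mc Donnell"))
```

In this sketch the same pipeline would be run on the stored name and on the search name, so that both sides are reduced to comparable keys before any database lookup takes place.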
Matching
The Service used to perform Matching is called MATCH.
When two records are passed to MATCH for matching, the name of the Matching Scheme to use is also identified by the Calling program. This Matching Scheme has been pre-defined as part of the customization process and contains the view of the passed records, which Matching Methods are to be used for each field in the record view, and what weights to assign to each field. There are different methods for matching names, addresses, dates, codes and other strings.
The Methods used for matching names or addresses indicate which Algorithm to use as part of the Matching process. Before two names or two addresses are compared they are first processed through the Cleaning & Formatting processes of the Algorithm as described in the previous section. The Matching Method then uses its own internal processes, which respond to customizable options, to compare the two Cleaned and Formatted strings. The Method may also resort to using the Word Stabilization routine as part of the matching process.
The Matching Methods used for dates, codes and other strings do not use Algorithms. These Methods use their own internal processes, which respond to customizable options, to compare the two fields.
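The idea of a Matching Scheme, a record view with one Method and one weight per field, can be sketched in Python as follows. This is an illustration only and not the MATCH API: the field names, the weights, the two toy Methods and the use of the standard library's `difflib.SequenceMatcher` as a stand-in for a name Method are all assumptions.

```python
from difflib import SequenceMatcher


def match_name(a, b):
    """Toy name Method: similarity of the upper-cased, space-normalized strings.
    Stands in for the Cleaning and Formatting an Algorithm would perform."""
    norm_a = " ".join(a.upper().split())
    norm_b = " ".join(b.upper().split())
    return SequenceMatcher(None, norm_a, norm_b).ratio()


def match_exact(a, b):
    """Toy Method for dates and codes: 1.0 on equality, otherwise 0.0."""
    return 1.0 if a == b else 0.0


# Hypothetical Matching Scheme: (field in the record view, Method, weight).
SCHEME = [
    ("name", match_name, 60),
    ("birth_date", match_exact, 30),
    ("postcode", match_exact, 10),
]


def match_records(rec_a, rec_b, scheme=SCHEME):
    """Return a weighted score between 0.0 and 1.0 for two records."""
    total_weight = sum(weight for _, _, weight in scheme)
    score = sum(
        weight * method(rec_a[field], rec_b[field])
        for field, method, weight in scheme
    )
    return score / total_weight


a = {"name": "William  Smith", "birth_date": "1970-01-01", "postcode": "2600"}
b = {"name": "william smith", "birth_date": "1970-01-02", "postcode": "2600"}
print(match_records(a, b))
```

Keeping the Scheme as data rather than code mirrors the description above: the Calling program only names the Scheme, and the choice of Methods and weights can change without touching the caller.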
Other SSA-NAME3 Services
The overview of SSA-NAME3 so far has concentrated on its ’core’ services, that is name and address searching, and matching.
SSA-NAME3 also provides other services. These are:
Word-key
Builds a compact key for a user-supplied word (as opposed to a ’name’). For example, it can be used to build fuzzy keys for text indexing.
Major-word-key
Builds a compact key from the one word in a name which is most suitable for indexing.
Trace
Traces the actions taken by the Formatting process and makes these available back to the Caller. For example, it can be used to identify the components of a name or address for such business needs as personalization of marketing letters, or geo-coding of addresses.
Support Routines
Access is provided to the Cleaning, Formatting and Word Stabilization routines directly. For example, names or addresses could be cleaned before they are stored in the database or before printing. The Formatting routine could be called directly to ’tokenize’ a name or address for some other purpose.
Test-bed
A stand-alone test facility which can invoke other SSA-NAME3 services and show the results of the Call.
Browse
Reports internal data and is generally used for debugging purposes.
Info
An API for retrieving information from internal SSA-NAME3 tables.
Debug
Assists in developing the rules for record matching by allowing user programs to dynamically alter Matching rules.
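As an illustration of what the Trace service described above makes available, the following Python sketch formats a name while recording the action taken on each word. The function name, the word tables and the shape of the trace entries are assumptions made for this example; they do not reflect the actual SSA-NAME3 Trace output.

```python
# Hypothetical word tables for illustration only.
NOISE_WORDS = {"MR", "MRS", "DR"}
REPLACEMENTS = {"BILL": "WILLIAM"}


def format_with_trace(text):
    """Return (words, trace); each trace entry is (input_word, action, output_word)."""
    words, trace = [], []
    for word in text.upper().split():
        if word in NOISE_WORDS:
            trace.append((word, "deleted", None))
        elif word in REPLACEMENTS:
            new_word = REPLACEMENTS[word]
            words.append(new_word)
            trace.append((word, "replaced", new_word))
        else:
            words.append(word)
            trace.append((word, "kept", word))
    return words, trace


words, trace = format_with_trace("Dr Bill Smith")
for entry in trace:
    print(entry)
```

A Caller could use such a trace to pick out, say, the title that was deleted from a name and reuse it in the salutation of a letter.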