Introduction to SSA-NAME3 User Guide

Problems Addressed by SSA-NAME3

For the typical application, SSA-NAME3 addresses the following problems:
  • Errors made in spelling the spoken word.
  • Transcription and keying errors for written names and codes.
  • Missing words, initials, numbers or codes.
  • Mixed usage of first names and initials.
  • Mixed usage of 1, 2, etc. with one, two, etc., and of 1st, 2nd, etc. with First, Second, etc.
  • Nicknames, formal and informal abbreviations, synonyms, language variation of common words.
  • Concatenation or splitting of words and codes.
  • Extra words and word sequence variations.
  • Presence of irrelevant "noise words" in the data.
  • Missing or "null" data.
  • Presence of "foreign" names and addresses.
  • Failures to find all parts of compound or account names where multiple entities are present in one name field.
  • Anglicization (localization) of names, causing variation between the formal name, as on a passport or driver's license, and the less formal names used on other transactions.
  • The problems created by the frequent use of certain common last and first names, or use of common words and numbers in organization names and addresses.
  • The fact that many names can be made from title or "noise" words. For example, Sister J Bishop, The Limited, The Company Inc.
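Several of the problems above (mixed number forms, noise words, and names made up entirely of noise words) can be illustrated with a small sketch. It is not SSA-NAME3 code and does not show how SSA-NAME3 works internally. The tables NUMBER_FORMS and NOISE_WORDS and the function normalize are hypothetical names invented for this example, and their contents are assumptions chosen only to make the point.

```python
# Illustrative sketch only; not the SSA-NAME3 API or its rule set.
# NUMBER_FORMS and NOISE_WORDS are hypothetical, hand-made tables.
NUMBER_FORMS = {
    "one": "1", "1st": "1", "first": "1",
    "two": "2", "2nd": "2", "second": "2",
}
NOISE_WORDS = {"the", "inc", "company", "limited"}


def normalize(name):
    """Return a list of comparison tokens for a name."""
    # Lower-case each word and drop leading/trailing periods and commas.
    tokens = [word.strip(".,").lower() for word in name.split()]
    tokens = [token for token in tokens if token]
    # Map number words and ordinals to a single digit form.
    tokens = [NUMBER_FORMS.get(token, token) for token in tokens]
    # Drop noise words, unless that would leave nothing to search on.
    kept = [token for token in tokens if token not in NOISE_WORDS]
    return kept if kept else tokens


print(normalize("First National Bank"))
print(normalize("1st National Bank Inc."))
print(normalize("The Limited"))
```

In this sketch "First National Bank" and "1st National Bank Inc." reduce to the same tokens, while "The Limited" keeps its words because removing every noise word would leave an empty name.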
For the higher volume or more sophisticated system, SSA-NAME3 also addresses the following issues:
  • The length of time the system takes before an answer is available.
  • The need to balance performance and quality.
  • The problem that weaker name search approaches either miss relevant names, or conversely show far too many names for the user to be able to make relevant choices.
  • That certain common nicknames and synonyms are better handled by "secondary searching". For example, Tina can be Chris but not Christopher, yet Chris can be Christopher.
  • The design of dialogues so that neither the operator nor the system comes too quickly to the conclusion that there is no relevant match. That is, the volume can cause data to be missed.
  • That increasing the width of the search to discover records with a larger amount of variation significantly aggravates the response time and performance.
  • That progressive refinement of the system by addressing special cases introduces undiscovered problems elsewhere and progressively degrades the system.
  • That mixing data from different systems in an integrated search leads to new errors and frustration for users because of variations in name handling between the source systems.
  • That ad hoc changes to a name search system may improve specific transaction quality yet, because re-testing is not exhaustive, can cause previously satisfied cases to fail.
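The nickname example above (Tina can be Chris, Chris can be Christopher, but Tina cannot be Christopher) shows that nickname relationships do not chain. The following sketch illustrates one reading of that point. It is not SSA-NAME3 code: the NICKNAMES table and both function names are hypothetical, and the one-step lookup is an assumption about what a separate, secondary search might consult.

```python
# Illustrative sketch only; not the SSA-NAME3 API or its nickname data.
# NICKNAMES is a hypothetical table built from the example in the text.
NICKNAMES = {
    "tina": {"chris"},
    "chris": {"christopher"},
}


def one_step_variants(name):
    """Return the name plus the variants listed directly against it."""
    key = name.lower()
    return {key} | NICKNAMES.get(key, set())


def chained_variants(name):
    """Return every name reachable by following variants repeatedly."""
    seen = set()
    todo = [name.lower()]
    while todo:
        current = todo.pop()
        if current in seen:
            continue
        seen.add(current)
        todo.extend(NICKNAMES.get(current, set()))
    return seen


print(sorted(one_step_variants("Tina")))
print(sorted(chained_variants("Tina")))
```

Following the chain links Tina to Christopher, which the text says is wrong. The one-step lookup keeps Tina with Chris and Chris with Christopher without merging all three.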