Service Group Definition and Customization Guide


Factors Which Determine the Format of the Name


The way that the name is combined before it is passed to NAMESET or used in MATCHing determines how the following parameter should be set:
NAME-FORMAT=(L or R)
The NAME-FORMAT parameter tells SSA-NAME3 where the major word is usually found in the name field: at the left end (L) or the right end (R). The major word in a Person name is usually the family name. The major word in a Company name tends to be the left-most significant word. The major word in an Address is usually the Street name. SSA-NAME3 uses its knowledge of the major word position for positive searches, when only a single key is requested for the name, and to allow weighting of the major word during the matching process. For example, NAME-FORMAT=R is a common setting for Western person names.
NAME-FORMAT=L is a common setting for Chinese names. If your data contains a mixture of name types and the major word is not in a stable position, use multiple keys rather than a single key, and do not weight the major word in matching.

SSA-NAME3 uses a tiered approach to discover the major word in the name, and the NAME-FORMAT setting is the bottom tier. First, if SSA-NAME3 finds a comma (,) in the name, the word preceding the comma is taken to be the major word. This comma processing can be turned off by setting CLEANING-OPTIONS #2 to Y. Second, Major Markers can be defined in the Edit-list. These can be special characters (for example, / or @) or words (for example, ST, RD, RUE or JALAN to determine a Street Name major word). The word adjacent to a Major Marker is taken as the major word. For more information on Major Markers, refer to the Edit-list chapter of this guide and the Cleaning chapter of the APPLICATION REFERENCE FOR SSA-NAME3 SERVICE GROUPS guide. Finally, if no major word has been identified, the NAME-FORMAT setting is used.

For example, when processing addresses, if NAME-FORMAT=L and SSA-NAME3 finds no street name Major Marker in the address, it goes to the left end of the address and searches for the first word that is neither a delete word nor a skip word. Delete words (such as THE, C/O) and skip words (such as APARTMENT) can be defined in the Street Edit-list and are customizable.
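
The tiered approach described above can be sketched in Python. This is only an illustration of the order of the tiers, not SSA-NAME3's implementation: the function name find_major_word, the word lists, and the rule for which side of a marker holds the major word (the text says only "adjacent") are all assumptions made for the example.

```python
# Hypothetical street word lists; in SSA-NAME3 these come from the Edit-list.
# The "before"/"after" side for each marker is an assumption of this sketch.
STREET_MAJOR_MARKERS = {"ST": "before", "RD": "before",
                        "RUE": "after", "JALAN": "after"}
DELETE_WORDS = {"THE", "C/O"}
SKIP_WORDS = {"APARTMENT"}


def find_major_word(name, name_format="R", comma_processing=True):
    """Return (major_word, tier) for a name, trying each tier in order."""
    # Tier 1: the word preceding a comma is taken as the major word.
    if comma_processing and "," in name:
        before_comma = name.split(",", 1)[0].split()
        if before_comma:
            return before_comma[-1].upper(), "comma"

    words = [w.upper() for w in name.replace(",", " ").split()]

    # Tier 2: the word adjacent to a major marker is the major word.
    for i, word in enumerate(words):
        side = STREET_MAJOR_MARKERS.get(word)
        if side == "before" and i > 0:
            return words[i - 1], "marker"
        if side == "after" and i + 1 < len(words):
            return words[i + 1], "marker"

    # Tier 3: fall back to NAME-FORMAT. Scan from the left (L) or the
    # right (R) for the first word that is neither a delete nor a skip word.
    ordered = words if name_format == "L" else list(reversed(words))
    for word in ordered:
        if word not in DELETE_WORDS and word not in SKIP_WORDS:
            return word, "name-format"
    return None, "none"


print(find_major_word("SMITH, JOHN"))                 # comma tier
print(find_major_word("JOHN SMITH", "R"))             # NAME-FORMAT=R fallback
print(find_major_word("WANG XIAOMING", "L"))          # NAME-FORMAT=L fallback
print(find_major_word("12 HIGH ST SPRINGFIELD", "L")) # marker tier
print(find_major_word("THE BROADWAY NEW YORK", "L"))  # delete word skipped
```

Passing comma_processing=False in this sketch stands in for turning comma processing off; the name then falls through to the marker and NAME-FORMAT tiers.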
Numbers can also be defined as skip words.

NAME-FORMAT=L would be an appropriate setting if the whole address is passed to NAMESET for key building, so that SSA-NAME3 does not look at the locality end of the address. NAME-FORMAT=R would be an appropriate setting if only the part of the address preceding the locality/state is passed to NAMESET for key building, to cover cases where the passed string contains no Street Type.
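
To make the difference between the two settings concrete for addresses with no Street Type, the fallback scan can be sketched as follows. The function fallback_major_word and its ignore list are hypothetical, and treating every all-digit word as a skip word is an assumption that mirrors the option of defining numbers as skip words; SSA-NAME3 itself may behave differently.

```python
def fallback_major_word(address, name_format,
                        ignore_words=frozenset({"THE", "C/O", "APARTMENT"})):
    """Scan from the NAME-FORMAT end for the first word that is not ignored."""
    words = address.upper().split()
    if name_format == "R":
        words.reverse()
    for word in words:
        # Numbers are treated as skip words in this sketch.
        if word.isdigit() or word in ignore_words:
            continue
        return word
    return None


# Whole address passed: scanning from the left avoids the locality.
print(fallback_major_word("12 BROADWAY NEW YORK", "L"))  # BROADWAY
# Whole address passed with R: the scan lands on the locality instead.
print(fallback_major_word("12 BROADWAY NEW YORK", "R"))  # YORK
# Only the street part passed: scanning from the right finds the street name.
print(fallback_major_word("12 BROADWAY", "R"))           # BROADWAY
```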
