Service Group Definition and Customization Guide

Phrase Definitions

Phrase replacements can be defined in the Edit-list alongside the Edit-word rules. Phrase replacement is similar to the replacement word (category type R) logic, except that it can detect and replace multiple words.
A phrase replacement is defined on two consecutive lines. The first line starts with *P, followed by a single space and then the phrase. A phrase has a maximum length of 50 characters. The second line starts with *R, also followed by a single space and a replacement phrase of at most 50 characters. These two lines must be specified in this order. A typical phrase replacement definition is as follows:
*P NOT KNOWN
*R UNKNOWN
This will replace the phrase NOT KNOWN with UNKNOWN.
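The two-line layout can be illustrated with a small reader. This is only a sketch in Python, not part of the product: the function name parse_phrase_definitions is hypothetical, and it assumes that lines starting with anything other than *P or *R belong to other Edit-list rules and can be skipped.

```python
def parse_phrase_definitions(lines):
    """Collect (phrase, replacement) pairs from *P / *R line pairs."""
    rules = []
    pending = None  # phrase from a *P line still waiting for its *R line
    for number, line in enumerate(lines, start=1):
        if line.startswith("*P "):
            if pending is not None:
                raise ValueError(f"line {number}: *P must be followed by *R")
            pending = line[3:].strip()
        elif line.startswith("*R "):
            if pending is None:
                raise ValueError(f"line {number}: *R without a preceding *P")
            rules.append((pending, line[3:].strip()))
            pending = None
        elif pending is not None:
            # The two lines must be consecutive and in this order.
            raise ValueError(f"line {number}: expected *R directly after *P")
        # ASSUMPTION: any other line is a different kind of rule; skip it.
    if pending is not None:
        raise ValueError("the last *P line has no *R line")
    return rules
```

Given the two lines of the definition above, this sketch returns a single pair holding NOT KNOWN and UNKNOWN.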
A phrase can be up to 50 characters long, with no limit on the number of words; however, each word has a maximum of 24 characters. Each double-byte character is translated to 5 bytes internally, so when using DBCS a phrase can be only 10 characters long.
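The arithmetic behind these limits can be sketched as a simple check. The names below are hypothetical, and two points are assumptions rather than documented behavior: every non-ASCII character is treated as a double-byte character, and the 24-character word limit is counted in characters rather than internal bytes.

```python
MAX_PHRASE_LENGTH = 50  # limit for a phrase, as stated in the text
MAX_WORD_LENGTH = 24    # limit for each word, as stated in the text
DBCS_COST = 5           # a double-byte character occupies 5 bytes internally


def internal_length(text):
    # ASSUMPTION: any non-ASCII character is a double-byte character.
    return sum(1 if ch.isascii() else DBCS_COST for ch in text)


def check_phrase(phrase):
    """Return a list of limit violations (empty when the phrase fits)."""
    problems = []
    if internal_length(phrase) > MAX_PHRASE_LENGTH:
        problems.append("phrase exceeds 50 characters")
    for word in phrase.split():
        # ASSUMPTION: the word limit is measured in characters.
        if len(word) > MAX_WORD_LENGTH:
            problems.append(f"word exceeds 24 characters: {word}")
    return problems
```

Under these assumptions, ten double-byte characters cost exactly 50 and still fit, while an eleventh pushes the phrase over the limit.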
If both the phrase and its replacement are one word, the simple Edit-word replace should be used. However, because phrase replacement is done before any other Edit-rules are applied, there may be situations where it makes sense to use a phrase replacement even if the phrase is a single word.
Phrases are processed as follows. The input name is broken into words (left-to-right, using BLANK boundaries) and each word is appended to an internal temporary phrase. After each word is added, the temporary phrase is checked to see if it ends in an Edit-list phrase definition. If a match is found, the longest phrase definition which matches the temporary phrase is used and the phrase replacement is made. After the replacement, the internal temporary phrase is re-checked against the Edit-list phrase definitions.
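The processing described above can be sketched in Python. This is an illustration of the stated steps, not the product's implementation: apply_phrase_rules and max_replacements are hypothetical names, rules are assumed to arrive as (phrase, replacement) pairs, and the replacement cap is an added safeguard against rules that keep re-triggering each other.

```python
def apply_phrase_rules(name, rules, max_replacements=100):
    """Apply phrase replacements left to right, longest match first."""
    # Split phrases into words. Among phrases that all end the same
    # temporary phrase, the one with the most words is also the longest.
    ordered = sorted(
        ((phrase.split(), replacement.split()) for phrase, replacement in rules),
        key=lambda rule: len(rule[0]),
        reverse=True,
    )
    temp = []  # the internal temporary phrase, kept as a list of words
    budget = max_replacements
    for word in name.split():
        temp.append(word)
        replaced = True
        while replaced:  # re-check after every replacement
            replaced = False
            for phrase, replacement in ordered:
                if phrase and temp[-len(phrase):] == phrase:
                    if budget == 0:
                        raise RuntimeError("phrase rules keep re-triggering")
                    budget -= 1
                    temp[-len(phrase):] = replacement
                    replaced = True
                    break
    return " ".join(temp)
```

For instance, apply_phrase_rules("ADDRESS NOT KNOWN", [("NOT KNOWN", "UNKNOWN")]) returns "ADDRESS UNKNOWN". Matching is done on whole words, so a phrase only matches when it ends the temporary phrase at a word boundary.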
For example, if the Edit-list has the following Phrase definitions:
*P IDENTITY SYSTEMS
*R XXX
*P UNITED IDENTITY SYSTEMS
*R YYY
then the name:
RESEARCH DIVISION UNITED IDENTITY SYSTEMS
would produce the following result:
RESEARCH DIVISION YYY
because UNITED IDENTITY SYSTEMS is the longer match of the two entries.
Notice that no XXX replacement will ever take place in this case, because once the replacement is made and the name becomes RESEARCH DIVISION YYY, the phrase IDENTITY SYSTEMS no longer exists.
If another Edit-list phrase definition existed as follows:
*P DIVISION YYY
*R Y DIVISION
then, when the name
RESEARCH DIVISION YYY
is re-processed against the Edit-list phrase definitions, the following result would be produced:
RESEARCH Y DIVISION
Another example illustrates the importance of left-to-right precedence. If the following phrase definitions are in the Edit-list:
*P DIVISION YYY
*R XXX
*P YYY DIVISION
*R ZZZ
then an input name of
YYY DIVISION YYY
produces output
ZZZ YYY
as follows:
  1. YYY DIVISION in the input name matches YYY DIVISION in the Edit-list and is replaced with ZZZ.
  2. ZZZ YYY does not match any Phrase replacement rule. The DIVISION YYY rule never applies, because DIVISION was already consumed by the earlier match.
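The left-to-right behavior in this example can be made visible by recording the temporary phrase after each input word. The following Python sketch is illustrative only: trace_phrase_rules is a hypothetical name, rules are assumed to be (phrase, replacement) pairs, and the cap of 100 re-checks per word is an added safeguard.

```python
def trace_phrase_rules(name, rules):
    """Return (word added, temporary phrase afterwards) for each input word."""
    parsed = [(phrase.split(), replacement.split()) for phrase, replacement in rules]
    temp, snapshots = [], []
    for word in name.split():
        temp.append(word)
        for _ in range(100):  # safety cap on re-checks (an assumption)
            # Every rule whose phrase ends the temporary phrase.
            matches = [(p, r) for p, r in parsed if p and temp[-len(p):] == p]
            if not matches:
                break
            # Use the longest matching phrase definition.
            phrase, replacement = max(matches, key=lambda m: len(m[0]))
            temp[-len(phrase):] = replacement
        snapshots.append((word, " ".join(temp)))
    return snapshots


rules = [("DIVISION YYY", "XXX"), ("YYY DIVISION", "ZZZ")]
for word, phrase in trace_phrase_rules("YYY DIVISION YYY", rules):
    print(f"after {word}: {phrase}")
```

The trace shows the temporary phrase going from YYY to ZZZ (once DIVISION is added and YYY DIVISION matches) and then to ZZZ YYY, so DIVISION YYY is never seen as a phrase.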
