Service Group Application Reference

Operation

Unlike the Formatting Service, when a name or address is passed to TRACE, it is first passed to the Cleaning Service. However, unlike the NAMESET Service, only Early Cleaning is performed, not the full Cleaning.
The Early Cleaning will perform the following actions:
  • Replace delimiters (as defined by the Character-set Table 14) with blanks.
  • Reduce multiple blanks to one blank.
  • Identify and remove Major Markers (as defined in the Edit-list) and perform any name reordering required.
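The first two Early Cleaning actions can be illustrated with a minimal Python sketch. It is not the product's implementation: the delimiter set below is a hypothetical stand-in for whatever the Character-set Table defines, the function name early_clean is invented for illustration, and Major Marker handling is omitted because it depends on the Edit-list.

```python
import re

# Hypothetical delimiter set; the real one is defined by the Character-set Table.
DELIMITERS = ",.;:/-"


def early_clean(text: str, delimiters: str = DELIMITERS) -> str:
    """Replace delimiter characters with blanks, then reduce runs of blanks to one."""
    # Step 1: map every delimiter character to a single blank.
    table = str.maketrans({ch: " " for ch in delimiters})
    blanked = text.translate(table)
    # Step 2: collapse two or more consecutive blanks into one blank.
    return re.sub(r" {2,}", " ", blanked)


cleaned = early_clean("gray,  jim-robert")
unchanged = early_clean("mr jim robert gray the 1st")
```

Note that the sketch does not change case, which matches the behaviour described for the TRACE example, where no upper-casing is performed.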
After the input name or address has been cleaned, the CLEANED NAME is passed to the Formatting Service using a special internal Function call.
This special Formatting Call produces what is known as an enhanced Words-stack. This is detailed from a program point of view in the section on Parameters.
The following is a typical TRACE output (produced by the Test-bed).
INPUT:  mr jim robert gray the 1st
OUTPUT: mr jim robert gray the 1st
STACK:  07 000 000 05
  1 MR      D D             000 001 PT
  2 JIM     D N             003 005 NK
  3 JAMES   Y Y C 10 C 03   003 005
  4 ROBERT  Y Y U 00 C 17   007 012
  5 GRAY    M Y C 00 U 00   014 017
  6 THE     D D             019 021 NW
  7 1ST     C B U 00 U 00   023 025
This output is more fully described below.
INPUT: mr jim robert gray the 1st
OUTPUT: mr jim robert gray the 1st
Although the TRACE Service performs some cleaning on the input name, it is only the "Early Cleaning" that is executed (characters replaced according to Character-set Table 14). Therefore, in this example no upper-casing was performed.
STACK:
07 000 000 05
This is the Word-stack header.
07
- A count indicating the number of words in the following Words-stack.
000 000
- Location of the Major Marker or Markers in the original name. If you have Major Markers defined in your Edit-list and they occurred in the name, these values tell you where they were in the pre-cleaned name.
If the markers were of type left, right, head or tail, there will be only one offset. With marker type ’delimiter’, the first value is the offset of the opening marker and the second that of the closing marker.
05
- Index of major word. This indicates which entry in the Words-stack is that of the Major word, as selected by the Formatting.
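As an illustration of the header fields just described, the following Python sketch splits the header text shown by the Test-bed into its four values. The names StackHeader and parse_stack_header are hypothetical, and the sketch assumes the whitespace-separated text form seen in the sample output, not the program-level layout covered in the section on Parameters.

```python
from typing import NamedTuple


class StackHeader(NamedTuple):
    word_count: int        # number of entries in the Words-stack
    marker_offset_1: int   # offset of the (opening) Major Marker
    marker_offset_2: int   # offset of the closing marker, for type 'delimiter'
    major_word_index: int  # which Words-stack entry is the Major word


def parse_stack_header(line: str) -> StackHeader:
    """Parse a header such as '07 000 000 05' into named integer fields."""
    fields = line.split()
    if len(fields) != 4:
        raise ValueError(f"expected 4 header fields, got {len(fields)}")
    return StackHeader(*(int(f) for f in fields))


header = parse_stack_header("07 000 000 05")
```

For the sample header this gives a word count of 7 and a Major word index of 5, which corresponds to the GRAY entry.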
1 MR D D 000 001 PT
- This is a typical personal title. The fields of interest are as follows:
D
- The Word-type, as decided by the Formatting after applying any Edit-list rules. In this case the D indicates that the word would have been deleted during the normal course of the Formatting.
D
- The Word-type, as decided by the Formatting before applying any Edit rules. In this example, the D comes from the following Category definition in the Edit-list:
*C PT D PERSONAL TITLE PT MR > <
000 001
- These two numbers are the offsets, within the name, of the first and last characters in this word. Position 1 is at offset 0.
PT
- The category name used to define this word in the Edit-list.
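Because the offsets are zero-based and both the first and last offsets are inclusive, a word can be recovered from the name with a slice that ends one past the last offset. The helper word_at below is a hypothetical name used only for this sketch.

```python
def word_at(name: str, first: int, last: int) -> str:
    """Return the characters from offset `first` to offset `last`, inclusive."""
    # Python slices exclude the end index, so add 1 to include `last`.
    return name[first:last + 1]


name = "mr jim robert gray the 1st"
title = word_at(name, 0, 1)      # entry 1, offsets 000 001
major = word_at(name, 14, 17)    # entry 5, offsets 014 017
code = word_at(name, 23, 25)     # entry 7, offsets 023 025
```

The slices are taken from the original input name, which is why the recovered words are in lower case while the Words-stack shows them in upper case.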
2 JIM D N 003 005 NK
- This is similar to the previous line except that the word is defined as a nickname in the Edit-list. In this example the Edit-list probably had a rule such as:
*C NK D NICKNAME NK JIM >JAMES <
which, with normal Formatting, would cause JIM to be replaced with JAMES. TRACE also does the replacement but keeps the original word and marks it as a deleted word.
3 JAMES Y Y C 10 C 03 003 005
- Here we have our first real word to survive the Formatting.
Y
- The Word-type, as decided by Formatting after applying any Edit rules.
Y
- The Word-type, as decided by the Formatting before applying Formatting rules. As this word was neither deleted nor had its Word-type changed by Formatting this is simply a duplicate of the first Y.
C
- Common or Uncommon Major word. If this word is a major word, then a C in this column indicates that the word was a common major word; a U means uncommon major word.
10
- Scale as a Major word. In this case a scale of 10 indicates that the word JAMES had a count of less than 10 in the Major word table. Note that this seemingly obvious translation of a scale of 10 to a count of 10 is misleading: this is a logarithmic scale that happens to have a 1:1 ratio at the value 10. For more information on how the scale is calculated, read the NAMESET / Parameters section earlier in this manual.
C
- Common or Uncommon Minor word. If this word is a minor word, then a C in this column indicates that the word was a common minor word; a U means uncommon minor word.
03
- Scale as a Minor word. The word JAMES occurred fewer than 2 times in the common Minor word table.
003 005
- Starting and ending position of word.
4 ROBERT Y Y U 00 C 17 007 012
- A normal word, flagged as being a minor word (Y), uncommon Major (U) and common Minor (C).
5 GRAY M Y C 00 U 00 014 017
- The Major word. Before Formatting rules were applied it was flagged as a Minor or possible Major word (Y). However, after the Formatting rules the word was identified as a Major word (M).
6 THE D D 019 021 NW
- A word defined as a noise word in the Edit-list.
7 1ST C B U 00 U 00 023 025
- A Suspect Code-word (B) was determined to be a Code-word (C) after the Formatting rules were applied.
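The entry lines walked through above can be read mechanically. The following Python sketch is an assumption-laden illustration, not the documented record layout: it infers the two line shapes from the sample alone (ten tokens for a surviving word, seven tokens for a deleted word that carries a category name), and the names StackEntry and parse_stack_entry are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StackEntry:
    index: int
    word: str
    type_after: str             # Word-type after the rules were applied
    type_before: str            # Word-type before the rules were applied
    major_flag: Optional[str]   # C = common, U = uncommon Major word
    major_scale: Optional[int]
    minor_flag: Optional[str]   # C = common, U = uncommon Minor word
    minor_scale: Optional[int]
    start: int                  # offset of first character
    end: int                    # offset of last character (inclusive)
    category: Optional[str]     # Edit-list category, e.g. PT, NK, NW


def parse_stack_entry(line: str) -> StackEntry:
    """Parse one Test-bed entry line; shapes are inferred from the sample only."""
    t = line.split()
    if len(t) == 10:  # surviving word: flags and scales present, no category
        return StackEntry(int(t[0]), t[1], t[2], t[3], t[4], int(t[5]),
                          t[6], int(t[7]), int(t[8]), int(t[9]), None)
    if len(t) == 7:   # deleted word: no flags or scales, category present
        return StackEntry(int(t[0]), t[1], t[2], t[3], None, None,
                          None, None, int(t[4]), int(t[5]), t[6])
    raise ValueError(f"unrecognised entry shape ({len(t)} tokens): {line!r}")


james = parse_stack_entry("3 JAMES Y Y C 10 C 03 003 005")
mr = parse_stack_entry("1 MR D D 000 001 PT")
gray = parse_stack_entry("5 GRAY M Y C 00 U 00 014 017")
```

Comparing type_before with type_after on each entry shows at a glance which words were reclassified, as with GRAY (Y to M) and 1ST (B to C) in the sample.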
