Identity Resolution
- Identity Resolution 10.5 HotFix 3
SEARCH: JOHN ALLEN ANDERSON
FILE: JOHN ANDERSON ALLEN
In N3SCL,
*C NN R Nick-names
NN JONATHON >JOHN <
OPTION CATSW
VALUE NN,9
SEARCH: JOHN SMITH
FILE: JONATHON SMITH
Option
| Description
| Example
|
|---|---|---|
OPTION SCORES
VALUE INIT,[number]
| This option controls how an initial matches against the first character of a word. [number] is a value between 0 and 10: 0 awards a Score of 0/10 when the initial matches the first character of the word, and 10 awards a Score of 10/10 when the initial matches. If SCALEFTR,1 is specified, [number] can be between 0 and 100. If an Edit-list nickname rule has been defined, for example to replace Bill with William, W. Smith would still match Bill Smith. If this option is omitted, an initial is compared to a full word using a string comparison and, if it matches, is awarded a Score of 3/10.
|
|
LOPT=(INITLOW)
| The default Score for an initial matching the first character of a word is 3/10. The INIT option (described above) can raise this Score to a maximum of 10/10, and by default the INIT value is applied to all cases where an initial matches the first character of a word. When the non-initial words do not match, however, it may be desirable to reduce the initial/word Score, for example when two family names do not match but the given initial of one still matches the given name of the other. The INITLOW option reduces the significance of initials if none of the non-initial words match. The Score in such cases is reduced to the default value of 3/10. Provided at least one of the non-initial words matches, INITLOW is not applied. For example, with VALUE INIT,10 specified, G N HOLLOWAY matches GREG NORMAN HALL with a Score of 076. With the INITLOW option, the Score is reduced to 030. If there is an exact match between any words in the names, INITLOW processing is disabled.
|
|
OPTION FLAGS
VALUE INITCODE,{0/1}
| This option prevents initials from being compared with words when either is a code, which avoids a high Score being returned when INIT is also used. A value of 0 turns the option on (that is, prevents initials being matched with words when either is a code); a value of 1 turns the option off. The default is off. For example, with INIT,10, 1 and 176 will score 3/10. With INITCODE,0 specified, the comparison gets a Score of 0/10. | |
OPTION FLAGS
VALUE EXACTWRD,{0/1}
VALUE EXACTINI,{0/1}
| With EXACTWRD,1 and EXACTINI,1, exact initial-to-initial matches are retained, regardless of whether a better Score could have been achieved by matching the initial to a word. For example,
GRIFFIN, JOHN W J
GRIFFIN, JAMES W J
with EXACTWRD,0 and EXACTINI,0 (the default) and VALUE INIT,10, will score 100, because the initial J in each name matches exactly with the words John and James respectively. With EXACTWRD,1 and EXACTINI,1 the Score would be lower, for example 080, because John and James are not as good a match. EXACTINI,1 requires EXACTWRD,1 before it will function.
| |
OPTION FLAGS
VALUE EXACTMCH,{0/1}
| If two records match exactly, a Score of 100 is given immediately, bypassing Formatting. This is not always desirable, for example when an Edit List rule should be applied prior to Matching. The default is EXACTMCH,1, which results in an early exact match check. Changing to EXACTMCH,0 switches off the exact match check. For example, the following Edit List rules are defined:
With EXACTMCH,1
With EXACTMCH,0
| |
OPTION FLAGS
VALUE SKIPMTCH,{0/1}
| Usually an initial will not match a skip word; SKIPMTCH,1 allows such a match. SKIPMTCH,0 is the default. For example, if University and Technology are skip words:
| With INIT,10
With INIT,10 and SKIPMTCH,1
|
OPTION FLAGS
VALUE OPTIMILW,{0/1}
| The default is OPTIMILW,1. When INITLOW is active and reduces an initial/word Score, a check is done to see if a better word match can be found. If one can, it is used instead of the degraded original match. To turn off this optimization, use OPTIMILW,0. For example, comparing these two names:
With INITLOW, INIT,10 and OPTIMILW,0, the Score returned would be 030. This occurs for two reasons: INIT,10 causes P / PETER to score 10/10 and to be chosen for the match over PETER / PETERS, and INITLOW takes effect on the P / PETER match because the PETER / PETERS pair was not a match, decreasing the Score to 030. With INITLOW, INIT,10 and OPTIMILW,1, the Score returned would be 080, because a check is done to see if a better word match can be found, in this case the Score of the PETER / PETERS pair.
| |
OPTION FLAGS
VALUE ILOWWRDS,{0/1}
| This option is used in conjunction with INITLOW to reduce the Score for an initial-to-word match (to 3/10) if there are any unmatched words between the two names. To turn it on, specify ILOWWRDS,1. The default is ILOWWRDS,0. For example, without ILOWWRDS,1 (and assuming REFMIN and INIT,9):
| With ILOWWRDS,1:
|
OPTION FLAGS VALUE ILWWRDFG,{0/1} | This option modifies the logic of ILOWWRDS processing so that the score for an initial-to-word match is not reduced even if there are unmatched words between the two names. To turn it on, specify ILWWRDFG,1. The default is ILWWRDFG,0. | For example, when you compare two names such as M Anderson and Michael Wayne Anderson with the option flag set to ILWWRDFG,1, the logic recognizes that M corresponds to Michael, and the unmatched middle name Wayne doesn't reduce the match score. |
OPTION FLAGS VALUE ILWWRDSM,{0/1} | This option modifies the match score percentage when a major word in a name matches only the initial of the corresponding word in another name, so that you can assign a partial score. To turn it on, specify ILWWRDSM,1. The default is ILWWRDSM,0. To enable this option, ensure that you set the ILWWRDFG and ILOWWRDS options to 1. | When you compare two names such as Pablo Algarte and Pablo A Provoto, the word Algarte is compared to the initial A. If the values matched at a score of 90, ILWWRDSM,60 sets the score to 60. |
OPTION SCORES
VALUE ILOWTRIG,[number]
| This option controls the value a word Score must reach to be considered a match by INITLOW processing. The default is 10; that is, if an initial / word match is present and two other words do not match 10/10, INITLOW processing takes place. Changing the value to 8 (as an example) prevents INITLOW from degrading the Score of an initial / word match when two other words are considered a reasonable match (in this case 8/10). If SCALEFTR,1 is specified, [number] can be between 0 and 100. For example, with options:
With the additional option:
the Score becomes 090 because the J / JOHN match is not affected by INITLOW. This is because the SMITH / SNITH match is 8/10 and the ILOWTRIG option causes INITLOW processing to be bypassed.
| |
LOPT=(ABBRMIN)
| ABBRMIN sets the minimum length of an abbreviation that can match. For example, assume ABBRMIN*3 is specified. If a word of length 3 or more matches the beginning of another (longer) word, the Score specified with the ABBRSCR option is returned. In other words, the short word is an abbreviation of the long word. Using the ABBRSCR example:
ROBE --> ROBERT matches
ROB --> ROBERT matches
ROBIN --> ROBERT doesn’t match
Note that the shorter of the two words must still be a 100% match with the beginning of the longer word for this logic to be invoked.
| |
LOPT=(ABBRSCR)
| Sets the Score for an abbreviated match, for example 8 = 80%, 10 = 100%. When two words match according to the ABBRMIN rules, the Score specified here is returned for the match on the two words. For example:
1. With no ABBRMIN or ABBRSCR
2. With LOPT=(ABBRMIN*3+ABBRSCR*10)
| |
OPTION FLAGS
VALUE ABBSCRCL,[number]
| Specifies whether to apply a penalty when using the ABBRSCR option.
| When matching the words BOW and BOWES with OPTION FLAGS VALUE ABBSCRCL,5, the score is reduced by 10:
Total penalty = Number of excess characters × Penalty value
10 = 2 × 5
|
OPTION FLAGS
VALUE FMTINIT,{0,1,2,3,4,5,6}
| FORMATTING-OPTIONS #9 controls how Formatting treats a run of two or more initials. If it is set to a value other than ’N’, initials will be concatenated. This is the normal behavior for company and mixed company/person algorithms. This is important for keys and search strategies so that, for example, ABC HOLDINGS is able to successfully find A.B.C. HOLDINGS. Formatting options also affect matching in that a name is processed through Formatting prior to being matched. This behavior, however, may be undesirable in cases such as when a search for J W SMITH finds JOHN SMITH. The two formatted names that get compared would be JW SMITH and JOHN SMITH and the JW and JOHN do not match well. Use one of the following values:
For more information about FORMATTING-OPTIONS #9, see Module Options.
| |
OPTION CONCINIT
VALUE THRSHOLD,[Score]
VALUE MININIT,[number]
VALUE MAXINIT,[number]
VALUE ALLWSKIP,{0/1}
VALUE SCORE,[Score]
VALUE PENALTY,[number]
VALUE NORSCORE,{0/1}
VALUE PARTMTCH,{0/1}
VALUE SKIPGOOD,{0,1}
| The CONCINIT option allows matching of acronyms to full names. For example:
An acronym may be retrieved as a candidate in a search by using the INITPROBE or INITRANGE NAMESET function keywords. An acronym and full name may also become a search and file record in matching because of a search on another field (for example, address). Acronym matching, if done, takes place at the end of the matching process, after an original Score has been computed. Acronym matching is attempted only if the original Score is below the THRSHOLD value. The default threshold Score is 80.
The MININIT and MAXINIT values set the minimum and maximum number of words in the full name that can be matched to the acronym (starting from the left). For example, it would be typical to set MININIT at 3 (the default) because most acronyms start at three words. A reasonable MAXINIT value would be 8 (the default).
By default, Skip Words are allowed to participate in acronym matching. Skip Words can be disallowed in acronym matching by setting ALLWSKIP to 0.
By default, a successful acronym match returns a Score of 100. To set the maximum Score lower, use the SCORE value setting. Using the PENALTY value, it is possible to decrement the acronym Score by the number of excess words in the non-reference record. If PENALTY is omitted, no penalty is applied for excess words.
By default, the acronym Score is returned only if it is greater than the original Score. By setting NORSCORE to 0, the acronym Score is returned whether it is greater or less than the original Score.
For looser matching, specify PARTMTCH,1. This allows part of the acronym to match, and a Score is computed relative to the number of initials that matched. For example,
will score 66 if PARTMTCH,1 is specified. With PARTMTCH,0 (the default), partial acronym matching is not allowed and the Score would be 0.
By default, words that match 100% are included in the CONCINIT rescore. By setting SKIPGOOD to 1, words that match 100% are excluded from the CONCINIT rescore.
| |
OPTION FLAGS VALUE REMPFXWD,{0,1}
| Indicates whether to remove the prefix of a word in the stack. Set the value to 1 to remove the prefix of the word. The default is 0.
| For example, if the word is coolpixp6000bk and you add OPTION FLAGS VALUE REMPFXWD,1, it removes the prefix coolpix and generates p6000bk.
|
OPTION SCORES VALUE RAWINITP,30 | Indicates whether to apply a penalty when initials and trailing characters don't match. The default is 0. The penalty is applied only when the following conditions are met:
| For example, consider the following names:
A penalty is applied for CARLOS and RICARDO because neither their initial characters nor their trailing characters match. No penalty is applied for DEANRADE and ANDRADE: even though their initial characters don't match, their trailing characters match. Similarly, no penalty is applied to the words MEE and LEE because the tail part LEE is the same in both words. However, if you compare MLEE and LEED, a penalty is applied because the tail parts differ and the initial characters do not match. |
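The ABBRMIN, ABBRSCR and ABBSCRCL rows above describe prefix-based abbreviation matching with an optional per-character penalty. As a rough illustration only, the following Python sketch models that behavior for a single pair of words. The function name `abbreviation_score` and its parameters are hypothetical, the 0 to 100 scale assumes that an ABBRSCR value of 10 corresponds to 100%, and none of this is the product's actual matching code.

```python
def abbreviation_score(word_a, word_b, abbr_min=3, abbr_scr=10, penalty_per_char=0):
    """Return a 0-100 score if one word abbreviates the other, else None.

    abbr_min         -- assumed stand-in for ABBRMIN (minimum abbreviation length)
    abbr_scr         -- assumed stand-in for ABBRSCR (8 = 80%, 10 = 100%)
    penalty_per_char -- assumed stand-in for ABBSCRCL (penalty per excess character)
    """
    # Work with the shorter word as the candidate abbreviation.
    short, long_ = sorted((word_a, word_b), key=len)

    # Equal-length words are not abbreviations of each other.
    if len(short) == len(long_):
        return None
    # The abbreviation must reach the minimum length.
    if len(short) < abbr_min:
        return None
    # The short word must match the start of the long word exactly.
    if not long_.startswith(short):
        return None

    # Total penalty = number of excess characters x penalty value.
    excess = len(long_) - len(short)
    return max(0, abbr_scr * 10 - excess * penalty_per_char)


if __name__ == "__main__":
    for pair in [("ROBE", "ROBERT"), ("ROB", "ROBERT"), ("ROBIN", "ROBERT")]:
        print(pair, abbreviation_score(*pair))
    # BOW vs BOWES has 2 excess characters, so a penalty value of 5 removes 10.
    print(abbreviation_score("BOW", "BOWES", penalty_per_char=5))
```

Under these assumptions, ROBE and ROB both qualify as abbreviations of ROBERT, ROBIN does not, and BOW against BOWES loses 2 × 5 = 10 points from the abbreviation score.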
Option
| Description
| Example
|
|---|---|---|
LOPT=(CONC)
| Allow concatenated matches. This option allows concatenated words to match against separate words. For example, when matching,
with
HACKFORTH JONES will match to produce a total Score of 100% with the CONC option. Without it, a Score of 75% is returned.
| |
LOPT=(CINITM)
| Allow multiple concatenations. This option allows the concatenation of more than two words. It requires that CONC is also specified.
| For example,
|
LOPT=(CINITI)
| Allow concatenation of initials. Requires that CONC is also specified.
| |
LOPT=(CINITA)
| Allow both initials and multiple concatenations. Shorthand for specifying both CINITI and CINITM. Requires that CONC is also specified. The syntax is:
| |
OPTION CONCAT
VALUE PLURALS,{0/1}
VALUE RAW,{0/1}
VALUE SCORE,[maximum Score]
VALUE THRSHOLD,[threshold Score]
VALUE ORIGWORD,{0/1}
VALUE WNUMBER,{0/1}
| By setting PLURALS to 1, a trailing S on one of the two words or concatenated words will match 100%. The default value of PLURALS is 0. Setting RAW to 1 performs a raw compare and accepts the match if it is above the threshold Score.
|
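The CONC option and the PLURALS value in the table above describe matching a concatenated word against two separate words. The following Python sketch is one possible way to picture that check. The function name `concat_match`, the word lists, and the DATA SYSTEM example are hypothetical, and the sketch only handles two adjacent words, which corresponds to the behavior described for CONC without CINITM.

```python
def concat_match(words_a, words_b, plurals=False):
    """Return True if two adjacent words in one name, joined together,
    equal a single word in the other name.

    plurals -- assumed stand-in for PLURALS,1: ignore one trailing S
               on either side before comparing.
    """

    def normalize(word):
        # With plurals enabled, drop a single trailing S.
        if plurals and len(word) > 1 and word.endswith("S"):
            return word[:-1]
        return word

    def one_way(split_words, joined_words):
        targets = {normalize(w) for w in joined_words}
        # Try every adjacent pair of words in the split name.
        return any(
            normalize(left + right) in targets
            for left, right in zip(split_words, split_words[1:])
        )

    # The concatenated form may appear in either name.
    return one_way(words_a, words_b) or one_way(words_b, words_a)


if __name__ == "__main__":
    print(concat_match(["HACKFORTH", "JONES"], ["HACKFORTHJONES"]))
    print(concat_match(["DATA", "SYSTEM"], ["DATASYSTEMS"], plurals=True))
```

A real implementation would return a Score rather than a boolean and would honor the SCORE and THRSHOLD values; those are left out here to keep the sketch short.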
Option
| Description
|
|---|---|
LOPT=(NOORDER)
| Normally, any Scores over 75 are degraded by 1 for each out-of-order word pair (or by larger amounts if
OPTION ORDER is used). This option disables that feature.
|
OPTION ORDER
VALUE POS,[number]
VALUE SEQ,[number]
VALUE TRIGGER,[number]
| Normally, any Score over 75 causes out-of-order word checking to be enabled. Default out-of-order word checking decrements a Score by 1 for each out-of-order word pair. This processing can be turned off with the NOORDER option. To change the default trigger Score of 75, use the TRIGGER option.
Out-of-order means either out-of-position or out-of-sequence. To explain the meaning of out-of-position and out-of-sequence, refer to the following example. The following two names have words out of position (SMITH vs ALAN), but not out of sequence (SMITH follows JOHN in both cases),
If the default out-of-order processing is used (that is, no NOORDER and no OPTION ORDER), and assuming REFMIN is also used, these two names will score 99. To decrement the Score only for out-of-position words, or only for out-of-sequence words, use the VALUE POS or VALUE SEQ option. These options are mutually exclusive. Use the VALUE POS option to specify a value (between 0 and 100) by which to decrement the Score for each word out-of-position. Use the VALUE SEQ option to specify a value (between 0 and 100) by which to decrement the Score for each word out-of-sequence.
|
OPTION ORDER
VALUE PER,[penalty]
VALUE PERFLAG,{0,1,2}
| Specifying VALUE PER,n causes an additional check of the first and last words in the two names to be performed. If the two words are different, the penalty n is applied to the Score. For example:
Not using VALUE PER,n
Using VALUE PER,1
In addition to VALUE PER,n, VALUE PERFLAG,m may be specified. Note that this option has an effect only where one of the two word stacks contains a single word. In these cases, the value of m modifies the behavior as follows:
- 0: Always apply the penalty. This is the default.
- 1: Ensure that the matching word is the first in each stack before applying the penalty. Use the NAME-FORMAT setting to determine the meaning of first; that is, if NAME-FORMAT=L, the matching word must be the leftmost word, and if NAME-FORMAT=R, the matching word must be the rightmost word.
- 2: Ensure that the matching word is the first in each stack before applying the penalty, irrespective of the NAME-FORMAT setting; that is, the matching word must be the leftmost word.
In the case where both names contain a single word, this option has no effect.
|
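The distinction between out-of-position and out-of-sequence words in the OPTION ORDER description can be made concrete with a small model. The Python sketch below is an assumption-laden illustration, not the product's algorithm: the function names are hypothetical, each word is assumed to occur at most once per name, out-of-sequence is counted per inverted pair of shared words, and the name pair JOHN SMITH / JOHN ALAN SMITH is a hypothetical stand-in chosen to fit the description (SMITH changes position but still follows JOHN).

```python
from itertools import combinations


def order_counts(search_words, file_words):
    """Count out-of-position and out-of-sequence words shared by two names."""
    common = [w for w in search_words if w in file_words]

    # Out of position: the shared word sits at a different index in each name.
    out_of_position = sum(
        1 for w in common if search_words.index(w) != file_words.index(w)
    )

    # Out of sequence: a pair of shared words appears in the opposite
    # relative order in the file name.
    out_of_sequence = sum(
        1
        for first, second in combinations(common, 2)
        if file_words.index(first) > file_words.index(second)
    )
    return out_of_position, out_of_sequence


def apply_order_penalty(score, search_words, file_words,
                        pos=None, seq=None, trigger=75):
    """Apply a POS-style or SEQ-style decrement to scores above the trigger."""
    # POS and SEQ are described as mutually exclusive.
    if (pos is None) == (seq is None):
        raise ValueError("specify exactly one of pos or seq")
    # Order checking only applies to scores over the trigger value.
    if score <= trigger:
        return score
    positions, sequences = order_counts(search_words, file_words)
    penalty = pos * positions if pos is not None else seq * sequences
    return max(0, score - penalty)


if __name__ == "__main__":
    a = ["JOHN", "SMITH"]
    b = ["JOHN", "ALAN", "SMITH"]
    print(order_counts(a, b))
    print(apply_order_penalty(100, a, b, pos=5))
    print(apply_order_penalty(100, a, b, seq=5))
```

In this model the hypothetical pair has one word out of position and none out of sequence, so only a POS-style decrement changes the score.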