Identity Resolution
- Identity Resolution 10.1 HotFix 1
Option
| Description
| Syntax
|
---|---|---|
ABBRMIN
| Sets the minimum length for an abbreviated match.
Refer to the
Local Options Addressing Truncation & Initials
section for further information.
| LOPT=
|
ABBRSCR
| Sets the Score for an abbreviated match. Refer to the
Local Options Addressing Truncation & Initials section for further information.
| LOPT=
|
ALLWSKIP
| Allow Skip words when matching to an acronym.
Refer to the
Local Options Addressing Truncation & Initials section for further information.
| OPTION CONCINIT
|
AVERAGE
| Calculates the average of the original and recalculated Score when
CLIMIT logic is activated. Refer to the
Local Options Addressing Long Names or Addresses section for further information.
| OPTION CLIMIT
|
CATSWD
| When using
CATSW or
CATSS, this option disables
CATSW and
CATSS processing when an Initial to Word match is being processed and the Word is in a
CATSW or
CATSS category. Refer to the
Local Options Addressing Word Type
section for further information.
| OPTION FLAGS
|
CATSWEXT
| When using
CATSW or
CATSS, this option enables a 100% match if the word pair was the same before editing. Refer to the
Local Options Addressing Word Type
section for further information.
| OPTION FLAGS
|
CATSWF
| When using
CATSW or
CATSS, this option forces
CATSW and
CATSS processing to be performed even if
MAJMOD processing is done. Refer to the
Local Options Addressing Word Type
section for further information.
| OPTION FLAGS
|
CHARDS
| Disable
CLIMIT logic for words which Score above the
CHARDS limit. Refer to the
Local Options Addressing Long Names or Addresses section for further information.
| OPTION CLIMIT
|
CINITA
| Allow both initial and multiple concatenations. Refer to the
Local Options Addressing Concatenation
section for further information.
| LOPT=
|
CINITI
| Allow concatenation of initials. Refer to the
Local Options Addressing Concatenation section for further information.
| LOPT=
|
CINITM
| Allow multiple concatenations. Refer to the
Local Options Addressing Concatenation section for further information.
| LOPT=
|
CLN
| Used by specialised code scoring option. Refer to the
Local Options Addressing Matching of Codes
section for further information.
| OPTION SORTSCOR
|
CODEMAXD
| Used by specialised code scoring option. Refer to the
Local Options Addressing Matching of Codes section for further information.
| OPTION CODESCOR
|
CODEPOSS
| Used by specialised code scoring option. Refer to the
Local Options Addressing Matching of Codes section for further information.
| OPTION CODESCOR
|
CODEWGHT
| Used by specialised code scoring option. Refer to the
Local Options Addressing Matching of Codes section for further information.
| OPTION CODESCOR
|
CODEUDIF
| Used by specialised code scoring option. Refer to the
Local Options Addressing Matching of Codes
section for further information.
| OPTION CODESCOR
|
CODEUONE
| Used by specialised code scoring option. Refer to the
Local Options Addressing Matching of Codes section for further information.
| OPTION CODESCOR
|
CONC
| Allow concatenated matches. Refer to the
Local Options Addressing Concatenation section for further information.
| LOPT=
|
EXACTCAT
| Ignores
CATSW and
CATSS option if an exact match exists after formatting. Refer to the
Local Options Addressing Word Type section for further information.
| OPTION FLAGS
|
EXACTWRD
| Causes exact word matches to be retained and not optimized. Refer to the
Local Options Addressing Word Type and Local Options Addressing Truncation & Initials sections for further information.
| OPTION FLAGS
|
EXACTINI
| Causes exact initial matches to be retained and not optimized. Refer to the
Local Options Addressing Truncation & Initials section for further information.
| OPTION FLAGS
|
EXACTMCH
| Switches off early exact match check. Refer to the
Local Options Addressing Truncation & Initials
section for further information.
| OPTION FLAGS
|
EXCTCODE
| Codes must match exactly. Refer to the
Local Options Addressing Word Type
section for further information.
| XOPT=
|
FMT
| Used by specialised code scoring option. Refer to the
Local Options Addressing Matching of Codes section for further information.
| OPTION SORTSCOR
|
FMTINIT
| Turn off
FORMATTING-OPTIONS #9, concatenation of initials. Refer to the
Local Options Addressing Truncation & Initials section for further information.
| OPTION FLAGS
|
GOOD
| Specifies the Word Score that needs to be achieved for a user-defined Score to be returned. Refer to the
Local Options Controlling Reference Record Matching section for further information.
| OPTION REFN
|
ILOWTRIG
| Controls the value for a word Score to be considered a match by
INITLOW processing. Refer to the
Local Options Addressing Truncation & Initials section for further information.
| OPTION SCORES
|
ILOWWRDS
| Lowers the Score if one or more words match and there are no initial matches. Refer to the
Local Options Addressing Truncation & Initials section for further information.
| OPTION FLAGS
|
INIT
| Controls how an initial matches against the first letter of a word. Refer to the
Local Options Addressing Truncation & Initials section for further information.
| OPTION SCORES
|
INITCODE
| Prevents codes acting as initials. Refer to the
Local Options Addressing Truncation & Initials section for further information.
| OPTION FLAGS
|
INITLOW
| Disables
INIT when the non-initial words do not match. Refer to the
Local Options Addressing Truncation & Initials section for further information.
| LOPT=
|
LIMWCAT
| Allows the maximum Weight of a word defined by
CATSW to be less than 10. Refer to the
Local Options Addressing Word Type section for further information.
| OPTION FLAGS
|
MAJMOD
| Allows the Score for major word matches to be increased or decreased. Refer to the
Local Options Addressing Word Order section for further information.
| LOPT=
|
MATCHEND
| Allows a string match (raw compare) to resync at the last character. Refer to the
Local Options Addressing Spelling
section for further information.
| OPTION FLAGS
|
MAXINIT
| Maximum number of words allowed when matching to an acronym. Refer to the
Local Options Addressing Truncation & Initials
section for further information.
| OPTION CONCINIT
|
MININIT
| Minimum number of words allowed when matching to an acronym. Refer to the
Local Options Addressing Truncation & Initials section for further information.
| OPTION CONCINIT
|
MOVEMNT
| Allows finer control over
MAJMOD. Refer to the
Local Options Addressing Word Type
section for further information.
| OPTION MAJMOD
|
NGRAMC
| Used by specialised code scoring option. Refer to the
Local Options Addressing Matching of Codes
section for further information.
| OPTION SORTSCOR
|
NGRAMCLV
| Used by specialised code scoring option. Refer to the
Local Options Addressing Matching of Codes section for further information.
| OPTION SORTSCOR
|
NGRAMF
| Used by specialised code scoring option. Refer to the
Local Options Addressing Matching of Codes section for further information.
| OPTION SORTSCOR
|
NGRAMFLV
| Used by specialised code scoring option. Refer to the
Local Options Addressing Matching of Codes section for further information.
| OPTION SORTSCOR
|
NOEXCLST
| Specify a list of Word-types to be used in
NOEXCESS processing to give improved matching on concatenated words. Refer to the
Local Options Controlling Reference Record Matching
section for further information.
| OPTION NOEXCLST
|
NOEXPNTY
| In
CLIMIT processing, reduce the score by a value dependent on the difference in the number of tokens between the two
CLIMLIST stacks. Refer to the
Local Options Addressing Long Names or Addresses
section for further information.
| OPTION CLIMIT
|
NOINCR
| Use the original Score if the new Score is greater than the original Score when
CLIMIT logic is activated. Refer to the
Local Options Addressing Long Names or Addresses section for further information.
| OPTION CLIMIT
|
NOORDER
| Disable Score reduction when words match out of order. Refer to the
Local Options Addressing Word Order
section for further information.
| LOPT=
|
NORAW
| Disable raw string matching. Refer to the
Local Options Addressing Spelling
section for further information.
| LOPT=
|
NORSCORE
| Controls whether a Score higher than the original Score can be returned from an acronym match. Refer to the
Local Options Addressing Truncation & Initials section for further information.
| OPTION CONCINIT
|
NOSTD
| Disable stabilized word matching. Refer to the
Local Options Addressing Spelling
section for further information.
| LOPT=
|
NSACTF
| Controls what action to take when
CLIMIT logic is activated and no new Score can be achieved. Refer to the
Local Options Addressing Long Names or Addresses
section for further information.
| OPTION CLIMIT
|
OPTIMILW
| Switches off initial/word matching optimization when using
INITLOW. Refer to the
Local Options Addressing Truncation & Initials section for further information.
| OPTION FLAGS
|
ORIGWORD
| Option allows comparing the original word as well as edit list replacement. Refer to the
Local Options Addressing Concatenation section for further information.
| OPTION CONCAT
|
ORIGWSCR
| Allows the Score to be re-calculated on each unformatted word and sets the maximum Score. Refer to the
Local Options Addressing Spelling
section for further information.
| OPTION SCORES
|
ORIGWTHR
| Score threshold below which
ORIGWSCR matching is allowed. Refer to the
Local Options Addressing Spelling section for further information.
| OPTION SCORES
|
PARTMTCH
| This allows part of the acronym to match and a score to be computed relative to the number of initials that matched. Refer to the
Local Options Addressing Truncation & Initials
section for further information.
| OPTION CONCINIT
|
PENALTY
| Decrements the Score by the
PENALTY value for excess words when using the
CONCINIT option. Refer to the
Local Options Addressing Truncation & Initials section for further information.
| OPTION CONCINIT
|
PENALTY
| Decrements the Score by the
PENALTY value for excess words when using
NOEXCESS option. Refer to the
Local Options Controlling Reference Record Matching
section for further information.
| OPTION NOEXCESS
|
PER
| Compares first and last words and applies the specified penalty if they are different. Refer to the
Local Options Addressing Word Order
section for further information.
| OPTION ORDER
|
PERFLAG
| Used to apply finer control to the
PER option (above). Refer to the
Local Options Addressing Word Order section for further information.
| OPTION ORDER
|
PLURALS
| Allows for a trailing ’S’ on one of the pair of words so as to match 100%. Refer to the
Local Options Addressing Concatenation section for further information.
| OPTION CONCAT
|
POS
| Set the decrement for out of position processing. Refer to the
Local Options Addressing Word Order
section for further information.
| OPTION ORDER
|
RAW
| Performs a raw compare of concatenated words. Refer to the
Local Options Addressing Concatenation section for further information.
| OPTION CONCAT
|
RAWCMPTN
| Causes raw string compares to be calculated out of 100 instead of 10. Refer to the
Local Options Addressing Spelling section for further information.
| OPTION FLAGS
|
RAWSTBTH
| Increases the score by a factor based on the raw and stabilized scores. Refer to the
Local Options Addressing Spelling
section for further information.
| OPTION SCORES
|
RAWSTBVL
| Value used in the
RAWSTBTH calculation. Refer to the
Local Options Addressing Spelling
section for further information.
| OPTION SCORES
|
RECREF
| Causes re-calculation of
REFMIN/REFMAX based on the word types in
CLIMLIST. Refer to the
Local Options Addressing Long Names or Addresses section for further information.
| OPTION CLIMIT
|
REFCNT
| Specifies the number of words that must be present to trigger a bonus penalty to be applied via
REFMULT. Refer to the
Local Options Controlling Reference Record Matching
section for further information.
| OPTION NOEXCESS
|
REFF
| Applies
REFMULT logic only if the file record meets the
REFCNT condition. Refer to the
Local Options Controlling Reference Record Matching
section for further information.
| OPTION NOEXCESS
|
REFMULT
| A multiplier for the
NOEXCESS PENALTY when
REFCNT &
REFF/REFS conditions are met. Refer to the
Local Options Controlling Reference Record Matching
section for further information.
| OPTION NOEXCESS
|
REFS
| Applies
REFMULT logic only if the search record meets the
REFCNT condition. Refer to the
Local Options Controlling Reference Record Matching section for further information.
| OPTION NOEXCESS
|
SCALEFTR
| Changes the word score scale to be out of 100 instead of out of 10. Refer to the
Local Options Controlling Word Score
section for further information.
| OPTION FLAGS
|
SCORE
| Sets the maximum Score allowed for a concatenated match. Refer to the
Local Options Addressing Concatenation
section for further information.
| OPTION CONCAT
|
SCORE
| Sets the maximum Score allowed for an acronym match. Refer to the
Local Options Addressing Truncation & Initials section for further information.
| OPTION CONCINIT
|
SCORE
| Controls the Score returned by
MAJMOD when the major word in one name matches well against any word in the other name. Refer to the
Local Options Addressing Word Type
section for further information.
| OPTION MAJMOD
|
SCORE
| Specifies the user-defined Score to be returned when using the
REFN option. Refer to the
Local Options Controlling Reference Record Matching
section for further information.
| OPTION REFN
|
SEC
| Allows a user-defined Score for Secondary Word matches. Refer to the
Local Options Addressing Multi-valued Fields section for further information.
| OPTION SCORES
|
SECOND
| Specifies which types of Secondary names should be expanded for matching. Refer to the
Local Options Addressing Multi-valued Fields section for further information.
| OPTION FLAGS
|
SECPHRSE
| Creates secondary phrase names. 0 is the default (off), 1 turns the feature on and 2 creates all secondary names. Refer to the
Local Options Addressing Multi-valued Fields
section for further information.
| OPTION FLAGS
|
SECPHRSE
| Allows a user-defined Score for secondary phrase matches. Refer to the
Local Options Addressing Multi-valued Fields
section for further information.
| OPTION SCORES
|
SECPORIG
| If this is set to 1, then include original names before secondary phrase rules are applied. Default is 0. Refer to the
Local Options Addressing Multi-valued Fields section for further information.
| OPTION FLAGS
|
SEQ
| Set the decrement for out of sequence processing. Refer to the
Local Options Addressing Word Order section for further information.
| OPTION ORDER
|
SKIPCONS
| Matches multiple consonants with a single consonant. For more information about SKIPCONS, see the Local Options Addressing Spelling topic.
| OPTION FLAGS
|
SKIPGOOD
| If this is set to 1 then words that match 100% are excluded from the
CONCINIT rescore. Default is 0. Refer to the
Local Options Addressing Truncation & Initials
section for further information.
| OPTION CONCINIT
|
SKIPMAJM
| Allows an exact match early-exit even when
MAJMOD is specified. Refer to the
Local Options Addressing Word Type section for further information.
| OPTION FLAGS
|
SKIPMTCH
| Allows an initial to match a skip word. Refer to the
Local Options Addressing Truncation & Initials section for further information.
| OPTION FLAGS
|
SKIPMOD
| Allows the Score for skip word matches to be increased or decreased. Refer to the
Local Options Addressing Word Type
section for further information.
| XOPT=
|
SKIPSMOD
| Allows an exact match early-exit even when
SKIPMOD is specified. Refer to the
Local Options Addressing Word Type section for further information.
| OPTION FLAGS
|
SKIPVOWL
| Matches vowels or ignores a vowel when compared with a consonant. For more information about SKIPVOWL, see the Local Options Addressing Spelling topic.
| OPTION FLAGS
|
SORTWGHT
| Used by specialised code scoring option. Refer to the
Local Options Addressing Matching of Codes section for further information.
| OPTION SORTSCOR
|
SREFCNT
| Specifies the number of skip words that must be present to trigger a bonus penalty to be applied via
SREFMULT. Refer to the
Local Options Controlling Reference Record Matching section for further information.
| OPTION NOEXCESS
|
SREFMULT
| A multiplier for the
NOEXCESS PENALTY when
SREFCNT &
REFF/REFS
conditions are met. Refer to the
Local Options Controlling Reference Record Matching
section for further information.
| OPTION NOEXCESS
|
STD
| Allows the Score for a stabilized word match to be increased or decreased. Refer to the
Local Options Addressing Spelling section for further information.
| OPTION SCORES
|
SYNCS
| Sets the minimum number of characters which must match to enable re-synchronization in a raw string comparison. Refer to the
Local Options Addressing Spelling section for further information.
| OPTION FLAGS
|
THRSHOLD
| Sets the level at which to accept a concatenated match. Refer to the
Local Options Addressing Concatenation section for further information.
| OPTION CONCAT
|
THRSHOLD
| Score threshold below which acronym matching is allowed. Refer to the
Local Options Addressing Truncation & Initials
section for further information.
| OPTION CONCINIT
|
THRSHOLD
| Sets the threshold Score above which
MAJMOD processing will take place. Refer to the
Local Options Addressing Word Type section for further information.
| OPTION MAJMOD
|
TRANSLEN
| Sets the minimum word length for character transposition matching to be applied. Refer to the
Local Options Addressing Spelling
section for further information.
| OPTION FLAGS
|
TRIGGER
| Decrements the Score if it is greater than or equal to the
TRIGGER Score. Refer to the
Local Options Controlling Reference Record Matching section for further information.
| OPTION NOEXCESS
|
TRIGGER
| Invoke out of order logic if the Score is above the
TRIGGER value. Refer to the
Local Options Addressing Word Order section for further information.
| OPTION ORDER
|
TRIGS
| Invoke
CLIMIT logic if the Score is above the
TRIGS value. Refer to the
Local Options Addressing Long Names or Addresses section for further information.
| OPTION CLIMIT
|
USECATSW
| Enables the CATSW option. For more information about the CATSW option, see the
Local Options Addressing Word Type section.
| OPTION REFN
|
USECWAIT
| Allows the maximum Score of a word pair to be less than 10. Refer to the
Local Options Controlling Word Score section for further information.
| OPTION FLAGS
|
WBELOW
| Set the Score for any single word to zero if the raw string word Score is less than a specified value. Refer to the
Local Options Addressing Spelling section for further information.
| OPTION SCORES
|
WORDS
| Specifies the number of non-initial words which need to match in order to return a user-defined Score. Refer to the
Local Options Controlling Reference Record Matching section for further information.
| OPTION REFN
|
WORSTSCR
| Controls which score should be returned when comparing a repeating group. Refer to the
Local Options Addressing Multi-valued Fields section for further information.
| OPTION FLAGS
|
WSCRNOEX
| Defines the amount to reduce the score when comparing a repeating group in which the number of repeats differs between the search and file records. Refer to the
Local Options Addressing Multi-valued Fields
section for further information.
| OPTION FLAGS
|
Editlist Category
| Specifies the Edit-list Category Name to participate in Score reduction. Refer to the
Local Options Addressing Word Type / OPTION CATSW
section for further information.
| OPTION CATSW
|
Editlist Category
| Specifies the Edit-list Category Name to participate in Score reduction. Refer to the
Local Options Addressing Word Type / OPTION CATSS section for further information.
| OPTION CATSS
|
Editlist Category
| Disables an Edit-list Category Name during Matching. Refer to the
Local Options Addressing Word Type / OPTION CATNIGN section for further information.
| OPTION CATNIGN
|
Editlist Category
| Disables an Edit-list Category Type during Matching. Refer to the
Local Options Addressing Word Type / OPTION CATTIGN
section for further information.
| OPTION CATTIGN
|
Editlist Category
| Overrides the meaning of an Edit-list Category Name with a different Category Type. Refer to the
Local Options Addressing Word Type / OPTION CATNREP section for further information.
| OPTION CATNREP
|
Editlist Category
| Overrides the meaning of an Edit-list Category Type with a different Category Type. Refer to the
Local Options Addressing Word Type / OPTION CATTREP
section for further information.
| OPTION CATTREP
|
Editlist Category
| Specifies the Edit-list Category Name to participate in Secondary name matching. Refer to the
Local Options Addressing Multi-valued Fields / OPTION SECCAT
section for further information.
| OPTION SECCAT
|
Word-type
| Specifies the Word-type to participate in re-scoring when using
CLIMIT logic. Refer to the
Local Options Addressing Long Names or Addresses / OPTION CLIMLIST
section for further information.
| OPTION CLIMLIST
|
Word-type
| Specifies the Word-type to participate in Secondary name matching. Refer to the
Local Options Addressing Multi-valued Fields / OPTION SECTYPE section for further information.
| OPTION SECTYPE
|
SEARCH: JOHN ALLEN ANDERSON
FILE: JOHN ANDERSON ALLEN
In N3SCL,
*C NN R Nick-names
NN JONATHON >JOHN <
OPTION CATSW
VALUE NN,9
SEARCH: JOHN SMITH
FILE: JONATHON SMITH
Option
| Description
| Example
|
---|---|---|
OPTION SCORES
VALUE INIT,[number]
| This option controls how an Initial will match against the first character of a word.
[number]
is a value between 0 and 10, where 0 means attribute a 0/10 Score if the Initial matches the first character of the word and 10 means attribute a 10/10 Score if the Initial matches. If
SCALEFTR,1 is specified,
[number]
can be between 0 and 100. If an Edit-list nickname rule has been defined, for example to replace Bill with William, W. Smith would still match Bill Smith. If this option is omitted, an initial will be compared to a full word using a string comparison and if it matches, will be awarded a Score of 3/10.
|
|
LOPT=(INITLOW)
| The default Score for an Initial matching the first character of a word is 3/10. With the
INIT option (described above), it is possible to raise this Score to a maximum of 10/10 and the
INIT value, by default, is applied to all cases where an Initial matches the first character of a word. In cases where the non-initial words do not match, however, it may be desirable to reduce the value of the Initial/Word Score, for example when two family names do not match but the given Initial of one still matches the given name of the other. The
INITLOW option reduces the significance of initials if none of the non-initial words match. The Score in such cases is reduced to the default value of 3/10. Provided at least one of the non-initial words matches,
INITLOW will not be applied. For example, with VALUE INIT,10 specified, G N HOLLOWAY will match GREG NORMAN HALL with a Score of 076. Using the
INITLOW option the Score is reduced to 030. If there is an exact match between any words in the name the processing of
INITLOW is disabled.
|
|
OPTION FLAGS
VALUE INITCODE,{0/1}
| This option is used to prevent Initials being compared with Words when either is a code. This is used to prevent a high Score being returned in the case where
INIT is also used. A value of
0 turns the option on (i.e. prevents Initials being matched with Words when either is a code); a value of 1 turns the option off. The default is off. For example, with INIT,10, 1 and 176 will Score 3/10. With INITCODE,0 specified, the comparison will get a Score of 0/10.
| |
OPTION FLAGS
VALUE EXACTWRD,{0/1}
VALUE EXACTINI,{0/1}
| With
EXACTWRD,1 and
EXACTINI,1, exact initial-to-initial matches will be retained, regardless of whether a better Score may have been achieved by matching the initial to a word. For example,
GRIFFIN, JOHN W J and GRIFFIN, JAMES W J, with
EXACTWRD,0 and
EXACTINI,0 (the default) and VALUE INIT,10, will score 100, because the initial
J in each name matches exactly with the words John and James respectively. With
EXACTWRD,1 and
EXACTINI,1
the Score would be lower, e.g. 080, because John and James are not as good a match.
EXACTINI,1 requires
EXACTWRD,1 before it will function.
| |
OPTION FLAGS
VALUE EXACTMCH,{0/1}
| If two records match exactly then a Score of 100 is immediately given, bypassing Formatting. This is not always desirable, for example, in cases where an Edit List rule should be used prior to Matching.
The default is
EXACTMCH,1, which results in an early exact match check. Changing to
EXACTMCH,0 switches off the early exact match check. For example, the following Edit List rules are defined:
With EXACTMCH,1 With EXACTMCH,0
| |
OPTION FLAGS
VALUE SKIPMTCH,{0/1}
| Usually an initial will not match a skip word; using
SKIPMTCH,1 will allow such a match.
SKIPMTCH,0 is the default. For example if University and Technology are skip words:
| With
INIT,10
With
INIT,10 & SKIPMTCH,1
|
OPTION FLAGS
VALUE OPTIMILW,{0/1}
| The default is
OPTIMILW,1. When
INITLOW is active and it reduces an initial/word Score, a check is done to see if a better word match can be found. If one can, it is used instead of the degraded original match.
To turn off this optimization, use
OPTIMILW,0. Comparing these two names for example,
With
INITLOW,
INIT,10 and
OPTIMILW,0, the Score returned would be 030. This occurs for two reasons.
INIT,10 causes
P / PETER to score 10/10 and to be chosen for the match over
PETER / PETERS, and
INITLOW takes effect on the
P / PETER
match because the
PETER / PETERS pair was not a match, thus decreasing the Score to 030. With
INITLOW,
INIT,10 and
OPTIMILW,1, the Score returned would be 080, because a check is done to see if a better word match can be found, in this case the Score of the
PETER / PETERS pair.
| |
OPTION FLAGS
VALUE ILOWWRDS,{0/1}
| This option is used in conjunction with
INITLOW to reduce the score for an initial-to-word match (to 3/10) if there are any unmatched words between the two names. To turn it on, specify
ILOWWRDS,1.
The default is
ILOWWRDS,0. For example, without
ILOWWRDS,1
(and assuming
REFMIN and
INIT,9 ):
| With
ILOWWRDS,1:
|
OPTION SCORES
VALUE ILOWTRIG,[number]
| This option controls the value for a word Score to be considered a match by the
INITLOW processing.
The default is 10, i.e. if an initial / word match is present and two other words do not match 10/10,
INITLOW processing will take place. Changing the value to 8 (as an example) will prevent
INITLOW degrading the Score of an initial / word match when two other words are considered a reasonable match (in this case 8 / 10). If
SCALEFTR,1 is specified,
[number] can be between 0 and 100. For example, with options:
With the additional option:
the Score becomes 090 because the J / JOHN
match is not affected by
INITLOW. This is because the
SMITH / SNITH match is 8/10 and the
ILOWTRIG option causes
INITLOW processing to be bypassed.
| |
LOPT=(ABBRMIN)
| ABBRMIN sets the minimum length of an abbreviation that can match. For example, assume
ABBRMIN*3
is specified. If a word of length 3 or more matches the beginning of another (longer) word, the Score specified with the
ABBRSCR option is returned. In other words the short word is an abbreviation of the long word. Using the
ABBRSCR example, ROBE --> ROBERT matches, ROB --> ROBERT matches, and ROBIN --> ROBERT doesn’t match. Note that the shorter of the two words must still be a 100% match with the beginning of the longer word for this logic to be invoked.
| |
LOPT=(ABBRSCR)
| Sets the Score for an abbreviated match, e.g. 8 = 80%, 10 = 100%. When two words match according to the
ABBRMIN rules the Score specified here is returned for the match on the two words. For example, 1. With no
ABBRMIN or
ABBRSCR
With
LOPT=(ABBRMIN*3+ABBRSCR*10)
| |
OPTION FLAGS
VALUE FMTINIT,{0/1}
| FORMATTING-OPTIONS #9 controls how Formatting treats a run of two or more initials. If it is set to a value other than ’N’, initials will be concatenated. This is the normal behavior for company and mixed company/person algorithms. This is important for keys and search strategies so that, for example, ABC HOLDINGS is able to successfully find A.B.C. HOLDINGS. Formatting options also affect matching in that a name is processed through Formatting prior to being matched. This behavior, however, may be undesirable in cases such as when a search for J W SMITH finds JOHN SMITH. The two formatted names that get compared would be JW SMITH and JOHN SMITH and the JW and JOHN do not match well. By setting FMTINIT to 0 (1 is the default),
FORMATTING-OPTIONS #9 is set to ’N’ (do not concatenate initials) for matching. This does not affect the key-building or searching. When using this option, if it is still desirable to have matching try concatenating the initials, then the options
CONC and
CINITI (or
CINITA) should also be specified.
| |
OPTION CONCINIT
VALUE THRSHOLD,[Score]
VALUE MININIT,[number]
VALUE MAXINIT,[number]
VALUE ALLWSKIP,{0/1}
VALUE SCORE,[Score]
VALUE PENALTY,[number]
VALUE NORSCORE,{0/1}
VALUE PARTMTCH,{0/1}
VALUE SKIPGOOD,{0/1}
| The
CONCINIT option allows matching of acronyms to full names. For example:
An acronym may be retrieved as a candidate in a search by using the INITPROBE or
INITRANGE NAMESET function keywords. An acronym and full name may also become a search and file record in matching because of a search on another field (e.g. address). Acronym matching, if done, takes place at the end of the matching process, after an original Score has been computed. Acronym matching will only be attempted if the original Score is below the
THRSHOLD value. The default threshold score value is 80. The
MININIT and
MAXINIT values set the minimum and maximum number of words in the full name that can be matched to the acronym (starting from the left). For example, it would be typical to set
MININIT at 3 (the default) because most acronyms start at three words. A reasonable
MAXINIT value would be 8 (the default). By default, Skip Words are allowed to participate in acronym matching. Skip Words can be disallowed in acronym matching by setting
ALLWSKIP to 0. By default, a successful acronym match will return a Score of 100. It may be desirable to set the maximum Score lower. This can be achieved with the
SCORE value setting. Using the
PENALTY value, it is possible to decrement the acronym Score by the number of excess words in the non-reference record. If
PENALTY is omitted, no penalty is applied for excess words. By default, the acronym Score is returned only if it is greater than the original Score. By setting
NORSCORE to 0, the acronym Score is returned whether it is greater or lesser than the original Score. For looser matching, specify
PARTMTCH,1. This allows part of the acronym to match and a score to be computed relative to the number of initials that matched. For example,
will score 66 if
PARTMTCH,1 is specified. 0, the default, does not allow part acronym matching and the Score would be 0. By default, words that match 100% are included in the
CONCINIT rescore. By setting
SKIPGOOD to 1, words that match 100% are excluded from the
CONCINIT rescore.
|
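Two of the word-pair rules described in the table above can be sketched in Python. This is only an illustrative model built from the descriptions of ABBRMIN/ABBRSCR and INIT/INITLOW/ILOWTRIG, not the product's implementation. The function names, the use of `None` to mean "rule not invoked", and the handling of INIT values below 3 are assumptions made for this sketch.

```python
from typing import Optional, Sequence

# Documented default Score (out of 10) for an initial matching a word's first character.
DEFAULT_INITIAL_SCORE = 3


def abbreviation_score(word_a: str, word_b: str,
                       abbrmin: int, abbrscr: int) -> Optional[int]:
    """Return abbrscr when the shorter word is an abbreviation of the longer one.

    The shorter word must be at least abbrmin characters long and must match
    the beginning of the longer word exactly. Returns None (an assumption of
    this sketch) when the abbreviation rule does not apply.
    """
    short, long_ = sorted((word_a, word_b), key=len)
    if len(short) == len(long_):
        return None  # same length: neither word abbreviates the other
    if len(short) >= abbrmin and long_.startswith(short):
        return abbrscr
    return None


def initial_word_score(init_value: Optional[int] = None,
                       initlow: bool = False,
                       other_word_scores: Sequence[int] = (),
                       ilowtrig: int = 10) -> int:
    """Score (out of 10) for an initial that matches the first letter of a word.

    init_value models VALUE INIT,[number]; None means the option is omitted.
    With initlow set, the Score falls back to the default when no non-initial
    word pair reaches ilowtrig (the documented default for ILOWTRIG is 10).
    """
    if init_value is None:
        return DEFAULT_INITIAL_SCORE
    if initlow and not any(s >= ilowtrig for s in other_word_scores):
        # Assumption: never raise a Score that INIT already set below the default.
        return min(init_value, DEFAULT_INITIAL_SCORE)
    return init_value


if __name__ == "__main__":
    # Mirrors LOPT=(ABBRMIN*3+ABBRSCR*10) from the table.
    for short_word in ("ROBE", "ROB", "ROBIN"):
        print(short_word, "-> ROBERT:", abbreviation_score(short_word, "ROBERT", 3, 10))
    # SMITH / SNITH scoring 8/10, as in the ILOWTRIG description.
    print(initial_word_score(10, initlow=True, other_word_scores=[8]))
    print(initial_word_score(10, initlow=True, other_word_scores=[8], ilowtrig=8))
```

The sketch covers only the per-word-pair Score. How those values combine into the overall record Score (for example 076 or 030) is not modelled here.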
Option
| Description
| Example
|
---|---|---|
LOPT=(CONC)
| Allow concatenated matches. This option allows concatenated words to match against separate words. For example, when matching,
with
The
HACKFORTH JONES will match to produce a total Score of 100% with the
CONC option. Without it a Score of 75% is returned.
| |
LOPT=(CINITM)
| Allow multiple concatenations. This option allows the concatenation of more than two words. It requires that
CONC is also specified.
| For example,
|
LOPT=(CINITI)
| Allow concatenation of initials. Requires that
CONC is also specified.
| |
LOPT=(CINITA)
| Allow both initials and multiple concatenations. Shorthand for specifying both
CINITI and
CINITM . Requires that
CONC is also specified.
The syntax is:
| |
OPTION CONCAT
VALUE PLURALS,{0/1}
VALUE RAW,{0/1}
VALUE SCORE,[maximum Score]
VALUE THRSHOLD,[threshold Score]
VALUE ORIGWORD,{0/1}
| By setting PLURALS to 1, a trailing S on one of the two words/concatenated words will match 100%. Default value of
PLURALS is 0. Setting
RAW to 1 will perform a raw compare and accept the match if it is above the
threshold Score .
|
Option
| Description
|
---|---|
LOPT=(NOORDER)
| Normally, any Scores over 75 are degraded by 1 for each out-of-order word pair (or by larger amounts if
OPTION ORDER is used). This option disables that feature.
|
OPTION ORDER
VALUE POS,[number]
VALUE SEQ,[number]
VALUE TRIGGER,[number]
| Normally, any Score over 75 causes out-of-order word checking to be enabled. By default, out-of-order word checking decrements a Score by 1 for each out-of-order word pair. This processing can be turned off with the
NOORDER option. To change the default trigger Score of 75, use the
TRIGGER option. Out-of-order means either out-of-position or out-of-sequence. To illustrate the difference between out-of-position and out-of-sequence, consider the following example. The following two names have words out of position (SMITH vs ALAN), but not out of sequence (SMITH follows JOHN in both cases),
If the default out-of-order processing is used (i.e. no
NOORDER and no
OPTION ORDER ), and assuming
REFMIN is also used, these two names will score 99. If it is desired to only decrement the Score if the names are either out-of-position or out-of-sequence, use the
VALUE POS or
VALUE SEQ options. These options are mutually exclusive. Use the
VALUE POS option to specify a value (between 0 and 100) by which to decrement the Score for each word out-of-position. Use the
VALUE SEQ option to specify a value (between 0 and 100) by which to decrement the Score for each word out-of-sequence.
|
OPTION ORDER
VALUE PER,[penalty]
VALUE PERFLAG,{0,1,2}
| Specifying
VALUE PER,n causes an additional check of the first and last words in the two names to be performed. If the two words are different then penalty
n is applied to the score. E.g: Not using
VALUE PER,n
Using
VALUE PER,1
In addition to
VALUE PER,n ,
VALUE PERFLAG,m may be specified. Note that this option has an effect only where one of the two word stacks contains a single word. In these cases, the value of
m modifies the behavior as follows:
0: Always apply the penalty. This is the default.
1: Ensure that the matching word is the first in each stack before applying the penalty. Use the
NAME-FORMAT setting to determine the meaning of first, i.e. if
NAME-FORMAT=L , then the matching word must be the leftmost word. If
NAME-FORMAT=R , then the matching word must be the rightmost word.
2: Ensure that the matching word is the first in each stack before applying the penalty, irrespective of the
NAME-FORMAT setting, i.e. the matching word must be the leftmost word. In the case where both names contain a single word, this option has no effect.
|
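The default out-of-order behavior described in this table can be sketched as below. The function and parameter names are hypothetical, and the sketch assumes the number of out-of-order word pairs has already been counted; it only shows how the trigger, the per-pair decrement and NOORDER interact.

```python
def apply_order_penalty(score: int, out_of_order_pairs: int,
                        trigger: int = 75, penalty: int = 1,
                        noorder: bool = False) -> int:
    """Hypothetical sketch of the default out-of-order Score decrement."""
    # NOORDER disables the feature; Scores at or below the trigger are untouched.
    if noorder or score <= trigger:
        return score
    # Decrement by `penalty` for each out-of-order word pair, never below 0.
    return max(0, score - penalty * out_of_order_pairs)


print(apply_order_penalty(100, 1))
print(apply_order_penalty(100, 1, noorder=True))
```

With one out-of-order pair and an initial Score of 100 this gives 99, which is consistent with the Score quoted above for the default processing.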
Option
| Description
| Example
|
---|---|---|
XOPT=(EXCTCODE)
| This option specifies that codes only match if they match exactly. For the definition of a ’code’ see the
Formatting Options
section. See also Local Option INITCODE
.
| For example,
|
OPTION FLAGS
VALUE EXACTWRD,{0/1}
| With
EXACTWRD,1 exact word to word matches will be retained, regardless of whether a better Score may have been achieved by other means. This situation arises mainly if the
MAJMOD option has also been set.
| For example,
with
EXACTWRD,0 (the default), and
MAJMOD*20 , will score around 070; with
EXACTWRD,1
and
MAJMOD*20 , will score lower, e.g. 056. This is because with
MAJMOD applied to the major words (HONG &
HOANG ) they score higher than the exact match for the words
HOANG &
HOANG , unless
EXACTWRD,1 is used, in which case the exact match words take precedence.
|
LOPT=(MAJMOD*[number])
| The
MAJMOD option tells the Entry Point to modify its Score if a match was found on a major word. This is done by applying a scaling factor to any major word (as flagged by the Formatting routine) found in the name.
For example when matching the name KEN JOHN BROWN the names KEN, JOHN and BROWN each contributes equally to the Score. However, the
MAJMOD option can be used to give more importance to the major word (BROWN in this case, i.e. if Algorithm
NAME-FORMAT=R .)
Giving a value for
MAJMOD of 10 will scale the major word by 1 (i.e. not scale it at all) and will give the same behavior as omitting the option. A value of 20 will scale by 2, etc. For example,
will cause the importance of the major word to be doubled in the final Score calculation. This is achieved by increasing both the score and the weight for the major word. If the
MAJMOD value is less than 10, the weight is not reduced below 100.
| For example,
If using
MAJMOD , also consider using
EXACTWRD,1 . When using
OPTION MAJMOD (see below), if
SCALEFTR,1
is also specified then the above scale values should be multiplied by 10, e.g. MAJMOD*200 rather than MAJMOD*20.
|
OPTION MAJMOD
VALUE LEVEL,{0,1,2,3}
VALUE MOVEMNT,{0,1,2}
VALUE SCORE,{n}
VALUE THRSHOLD,{n}
| These options allow
MAJMOD to be enabled but with finer control. The setting of LEVEL defines the rules which activate
MAJMOD score modification.
LEVEL,0 . Uses the
MAJMOD option when the major word in one name matches with any word in the other name.
LEVEL,1 . Uses the
MAJMOD option when the major words in both the names match and share the same position. The
MAJMOD option uses the same rules as that of the
LOPT=MAJMOD option except that the
MAJMOD option cannot reduce the score.
LEVEL,2 . Uses the
MAJMOD option when the major word in one name matches with the major word of other name that is in the same or adjacent position.
LEVEL,3 . Uses the
MAJMOD option when the major words in both the names match and share the same position. The
MAJMOD option uses the same rules as that of the
LOPT=MAJMOD option.
VALUE MOVEMNT . Dictates how
MAJMOD can affect the score; negatively, positively or both.
MOVEMNT,0 (the default) indicates that
MAJMOD can increase or decrease the score.
MOVEMNT,1 indicates that
MAJMOD can increase the score, but not decrease it.
MOVEMNT,2 indicates that
MAJMOD can decrease the score, but not increase it.
VALUE SCORE . The word Score assigned to the major word comparison. With
SCALEFTR,10 it can be set with values of n from 0 to 120. With
SCALEFTR,1 it can be set with values of n from 0 to 1200. In either case, the default is 0.
VALUE THRSHOLD is the word Score threshold above which to activate this logic. It can have a value of n from 0 to 100, the default is 100. This value is unaffected by the setting of
SCALEFTR and should always be specified in the range 0-100.
| |
OPTION FLAGS
VALUE SKIPMAJM,{0/1}
| By default, an exact match check is done on two names before any other processing. If an exact match is found, the method exits early with a score of 100. In some rare cases, it may be desirable to bypass the exact match check if
MAJMOD is specified. This is because
MAJMOD can be used to lower the score if the major words match. To bypass the exact match check when
MAJMOD is specified, set
SKIPMAJM,1 .
SKIPMAJM,0 (the default) will enable the exact match early-exit.
| |
XOPT=(SKIPMOD*[number])
| The
SKIPMOD option tells the Method to modify its Score if a match was found on a skip word (as flagged by the Formatting routine).
Giving a value for
SKIPMOD of 10 will scale the Score for matching skip words by 1 (i.e. not scale them at all) and will give the same behavior as omitting the option. A value of 5 will cause the importance of the skip words, if they match, to be halved in the final calculation of the Score; this has the effect of increasing the importance of the non-skip words in the name.
| For example, if the Edit-list contains the following rules,
|
OPTION FLAGS
VALUE SKIPSMOD, {0/1}
| By default, an exact match check is done on two names before any other processing. If an exact match is found, the method exits early with a score of 100. In some rare cases, it may be desirable to bypass the exact match check if
SKIPMOD is specified. This is because
SKIPMOD can be used to lower the score if skip words match. To bypass the exact match check when
SKIPMOD is specified, set
SKIPSMOD,1 .
SKIPSMOD,0 (the default) will enable the exact match early-exit.
| |
OPTION CATSW
VALUE [Edit-list Category],[Number]
| The Name Matching Method by default passes the names to be matched through both the Cleaning and Formatting routines. The Formatting routine, among other things, transforms the name components according to rules in the Edit-list. The Categories of any Edit-list rules which have been applied are then passed back to the Method. The default maximum Score for a word comparison is 10/10. The
CATSW option is used for ranking purposes to reduce the score of certain words which were originally different but changed via Edit-list rules to match. For example, Nickname Replacement or Secondary Name rules. This will have the effect of reducing the overall score. For example, if a search was done on JOHN BROWN ADVERTISING then both JOHN BROWN MARKETING and JOHN BROWN ENGINEERING would score the same; however, it may be desirable to rank JOHN BROWN MARKETING above JOHN BROWN ENGINEERING.
| This can be achieved by first defining the ’similar’ words in the Edit-list in a manner similar to the following,
Then,
As Marketing and Engineering do not match via Edit-List rules, the score remains at 66%.
CATSW can now be used with a Secondary Name Edit-list category (in the fast-starts this is known as
SN ). This is useful when using
CATSW to degrade the Score of nickname fields for ranking. In N3SCL, only
NN(R) and
NK(N) categories can be used.
To reduce the maximum Weight of a word which is defined in an Edit-list category, use either LIMWCAT,0 with CATSW, or use CATSS.
|
OPTION CATSS
VALUE [Edit-list Category],[Number]
| CATSS is used to reduce the maximum Weight (significance) of certain words which were originally different but changed via Edit-list rules to match.
LIMWCAT is always set to 0 for
CATSS . If a Category is defined for both
CATSW and
CATSS ,
CATSW will take precedence.
| |
OPTION FLAGS
VALUE CATSWD,{0/1}
| By setting
CATSWD to 1,
CATSW and
CATSS processing will be bypassed when an Initial to Word match is being processed and the Word is in a
CATSW or
CATSS category.
The "D" stands for "Disable".
| |
OPTION FLAGS
VALUE CATSWEXT,{0/1}
| By setting
CATSWEXT to 1, an exact match before formatting between two words will now score 100 even if a word belonged to one of the categories specified via
CATSW or
CATSS .
| |
OPTION FLAGS
VALUE CATSWF,{0/1}
| By setting
CATSWF to 1,
CATSW and
CATSS processing will be performed even if
MAJMOD processing is done.
The "F" stands for "Force".
| |
OPTION FLAGS
VALUE EXACTCAT,{0/1}
| By setting
EXACTCAT to 1, an exact match after formatting between two words will now score 100 even if a word belonged to one of the categories specified via
CATSW or
CATSS which specified that the word’s Score should be reduced.
| For example, if the Edit-list contains the following rules,
Then,
For more information, see the
Word Weight Modification section in N3SCM.
The default is 0.
|
OPTION FLAGS
VALUE LIMWCAT,{0/1}
| The default behavior ( LIMWCAT,1 ) does not allow the
CATSW option to use a maximum Weight less than 10. By setting
LIMWCAT to 0, the maximum Weight of a word which is in a category defined by the
CATSW option can be less than 10. For more information, see the
Word Weight Modification section in N3SCM.
| |
OPTION CATNREP
VALUE nnt,0
| CATNREP allows you to override the meaning of an Edit-list Category Name while performing Matching. "nn" is the Edit-list Category Name, and "t" is the new Category Type. The value 0 is ignored but must be present. Multiple VALUE statements are permissible.
| |
OPTION CATNIGN
VALUE nn,0
| CATNIGN allows you to entirely disable, or ignore, an Edit-list Category Name while performing Matching. "nn" is the Edit-list Category Name. The value 0 is ignored but must be present. Multiple VALUE statements are permissible.
| |
OPTION CATTREP
VALUE ct,0
| CATTREP allows you to override the meaning of an Edit-list Category Type while performing Matching. "c" is the Edit-list Category Type, and "t" is the new Category Type. The value 0 is ignored but must be present. Multiple VALUE statements are permissible.
| |
OPTION CATTIGN
VALUE t,0
| CATTIGN allows you to entirely disable, or ignore, an Edit-list Category Type while performing Matching. "t" is the Edit-list Category Type. The value 0 is ignored but must be present. Multiple VALUE statements are permissible.
|
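The MAJMOD and SKIPMOD rows earlier in this table express their scaling as a number where 10 means "scale by 1". The sketch below only converts that number into a multiplier. The function name is hypothetical, and applying the SCALEFTR,1 adjustment (stated in the text for MAJMOD) to any other option is an assumption.

```python
def modifier_multiplier(option_value: int, scaleftr: int = 10) -> float:
    """Hypothetical helper: convert a MAJMOD/SKIPMOD value into a multiplier."""
    if scaleftr not in (1, 10):
        raise ValueError("SCALEFTR must be 1 or 10")
    # With the default SCALEFTR,10 a value of 10 means "scale by 1".
    # With SCALEFTR,1 the text says MAJMOD values are multiplied by 10,
    # so 100 then means "scale by 1".
    base = 10 if scaleftr == 10 else 100
    return option_value / base


print(modifier_multiplier(20))                # MAJMOD*20 under SCALEFTR,10
print(modifier_multiplier(200, scaleftr=1))   # MAJMOD*200 under SCALEFTR,1
print(modifier_multiplier(5))                 # SKIPMOD*5 under SCALEFTR,10
```

How the multiplier is then applied to the word score and weight is not modelled here, since the text describes it only in general terms.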
Option
| Description
|
---|---|
OPTION FLAGS
VALUE SKIPCONS,[number]
| Matches multiple consonants with a single consonant. Set the value to 1 to enable this option. Default is 0.
For example, after you set this option to 1, if you match CROSS and CROS, the consonants
SS match the consonant
S .
|
OPTION FLAGS
VALUE SKIPVOWL,[number]
| Matches vowels or ignores a vowel when compared with a consonant. Set the value to 1 to enable this option. Default is 0.
For example, after you set this option to 1, if you match ABAD and ABED, the vowel
A matches with the vowel
E .
|
OPTION FLAGS
VALUE SYNCS,[number]
| When raw string matching is used, two names are compared character by character. If two characters do not match, the method will look ahead for the full length of the name for a character match, and attempt to resynchronize the matching from that character forward. This option tells the method how many characters must match in a look-ahead operation for the re-synchronization to be accepted. The default value is 2.
|
OPTION FLAGS
VALUE TRANSLEN,[number]
| When raw string matching is used, two transposed characters are accepted as a match (2/2) if the word length is greater than or equal to
[number] . The default is 1. If the word length is less than
[number] the two transposed characters score 1/2. For example,
John Patterson and John Pattesron score 100.
John Bent and John Bnet score 075.
|
OPTION SCORES
VALUE STD,[number]
| If two stabilized words match, the default word Score given is 7/10. This option allows the word Score to be increased or decreased. For example,
will give a word Score of 8/10 for a stabilized word match. If
SCALEFTR,1 is specified,
[number] can be between 0 and 100.
|
LOPT=(NOSTD)
| This option disables the stabilized matching of two words. With this option set, no stabilized comparisons will take place.
|
LOPT=(NORAW)
| This option disables the raw string matching. With this option set, no raw string comparisons will take place.
|
OPTION SCORES
VALUE ORIGWTHR,[number]
VALUE ORIGWSCR,[number]
| If the initial Score for a match is below the
ORIGWTHR threshold value (a value between 1 and 100),
ORIGWSCR logic will recalculate the Score on each ’unformatted’ word, i.e. after Cleaning but without Edit-list processing. A raw string comparison will be done on the words. If the result is a Score less than the maximum possible stabilized word score, a comparison is also done on the stabilized form of the words, and the higher of the two scores used. The
ORIGWSCR value is then used to scale the Score, and the resulting Score will be used if it is greater or equal to the initial Score. For example, if there was an Edit-list rule replacing Nathan with Nathaniel:
Without
ORIGWSCR (or
ORIGWSCR,0 ):
(as we are actually comparing NATHANIEL to
NATHON due to the activation of an Edit-list rule).
With
(as we are now comparing NATHAN to
NATHON and using the higher Score).
|
OPTION FLAGS
VALUE MATCHEND,[number]
| Allows a string match (raw compare) to resync even at the last character.
[number]
defaults to 0. For example, using the defaults for SYNCS and MATCHEND, Tiene vs Tienne scores 6/10.
With
|
OPTION SCORES
VALUE WBELOW,[Number]
| Sets the Score for any single word to zero if the raw string word Score is less than
[number] ,
where
[number] can be between 1 and 100. For example, not using WBELOW:
With VALUE WBELOW,75
|
OPTION FLAGS
VALUE RAWCMPTN,{0/1}
| The default setting of 1 causes a raw string compare of a word to be calculated out of 10 and any remainder is dropped (e.g. a score of 8.7/10 will become 8/10 or 80/100). By changing the setting to 0, the raw string compare will be calculated out of 100 (e.g. 87/100 will result in a word score of 087). For example, Without RAWCMPTN (or RAWCMPTN,1):
With RAWCMPTN,0
|
OPTION SCORES
VALUE RAWSTBTH,n
VALUE RAWSTBVL,n
| If the score from the raw string compare is greater than RAWSTBTH then improve the score using the following formula:
It increases the score by a factor based on the raw score and stabilized score. The default value for RAWSTBTH is 0, which disables this option.
|
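The RAWCMPTN truncation described in this table can be illustrated with a short sketch. The function name is hypothetical, and treating the raw compare as "matching characters divided by length" is a simplifying assumption; only the rounding behavior is taken from the text.

```python
def raw_word_score(matching: int, length: int, rawcmptn: int = 1) -> int:
    """Hypothetical sketch of how RAWCMPTN changes raw compare granularity."""
    if length <= 0:
        return 0
    if rawcmptn == 1:
        # Default: score out of 10, remainder dropped, then expressed out of 100.
        return (10 * matching // length) * 10
    # RAWCMPTN,0: score directly out of 100.
    return 100 * matching // length


print(raw_word_score(87, 100, rawcmptn=1))
print(raw_word_score(87, 100, rawcmptn=0))
```

The two calls mirror the 8.7/10 case in the text: the default setting truncates to 80, while RAWCMPTN,0 keeps 87.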
| Words | Opts | Scr | Comment |
|---|---|---|---|
| KAN KON | | 70 | Both the stabilized and raw compare are performed and the highest Score is used. The raw compare scores 6/10 (2/3 characters match); however, the words stabilize to the same form and score 7/10 (the default). The Score is therefore 70. |
| KAN KON | NOSTD | 60 | Only the raw compare is performed and a Score of 60 is returned (2/3 characters match giving 6/10). |
| KAN KON | NORAW | 70 | Only the stabilized compare is performed. Because the two words are the same after Word Stabilization, they score 7/10 (the default) and a Score of 70 is returned. |
| KAN KON | NOSTD, NORAW | 00 | Using both options forces an exact match comparison on the words after they have been processed through the Edit-list. As the two names are not exactly the same the Score is 0. |
| ABCDEFGKAN ABCDEFGKON | | 90 | The raw compare returns a higher value (9/10) than the stabilized compare, which defaults to 7/10. |
| ABCDEFGKAN ABCDEFGKON | NOSTD | 90 | Same result as above, as the stabilized compare was overridden by the raw compare anyway. |
| ABCDEFGKAN ABCDEFGKON | NORAW | 70 | The stabilized words return an exact match, which defaults to 7/10. |
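The interaction between the raw compare, the stabilized compare, NOSTD and NORAW shown in the table above can be reproduced with a small sketch. It is a simplification under stated assumptions: the raw compare is modelled as a position-by-position character comparison (no resynchronization or transposition handling), and Word Stabilization is not implemented, so the caller passes a hypothetical `stabilized_equal` flag.

```python
def word_score(a: str, b: str, stabilized_equal: bool,
               nostd: bool = False, noraw: bool = False,
               std_score: int = 7) -> int:
    """Hypothetical sketch: take the higher of the raw and stabilized word scores."""
    if a == b:
        return 100  # exact match
    candidates = [0]
    if not noraw:
        # Simplified raw compare: matching positions out of 10, remainder dropped.
        same = sum(1 for x, y in zip(a, b) if x == y)
        candidates.append(10 * same // max(len(a), len(b)))
    if not nostd and stabilized_equal:
        # A stabilized match scores 7/10 by default (OPTION SCORES, VALUE STD).
        candidates.append(std_score)
    return max(candidates) * 10


print(word_score("KAN", "KON", True))
print(word_score("KAN", "KON", True, nostd=True))
print(word_score("ABCDEFGKAN", "ABCDEFGKON", True, noraw=True))
```

Under these assumptions the sketch returns the same Scores as the table rows: 70, 60, 70 and 0 for KAN/KON, and 90, 90 and 70 for the longer pair.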
Option
| Description
|
---|---|
OPTION CLIMIT
VALUE TRIGS,[number]
VALUE CHARDS,[number]
VALUE NSACTF,{0/1}
VALUE NOINCR,{0/1}
VALUE AVERAGE,[number]
VALUE NOEXPNTY,[number]
VALUE RECREF,{0/1}
| CLIMIT logic is executed if the initial Score for a match is above the TRIGS value (a value between 0 and 100). If the initial Score is 100, however, the
CLIMIT logic is bypassed.
CLIMIT logic will recalculate the Score while allowing only those words whose types appear in
CLIMLIST to participate in this recalculation. Words which scored above a user-defined limit (CHARDS ) may also be excluded from the recalculation.
CHARDS can have a value of between 0 and 100.
The
NSACTF (No Match Action Flag) flag can be set to dictate what Score to return when no new Score is possible. Valid values: 0 and 1. A value of 0 will return a Score of 0 if no new Score is possible. A value of 1 will return the original Score. The default is 0.
The
NOINCR flag can be set so as to remember the original Score and if the original Score is less than the recalculated Score then the original Score is used. This is useful to allow bad matches to decrease the Score but prevent good matches from increasing the Score.
Using
AVERAGE and a value of
N , where
N defaults to 10, means that the ultimate Score is calculated as:
This will allow a mix of the effect of the original and the recalculated Scores to be created without having to use two methods on the same field.
It would be quite normal to use
NOINCR and
AVERAGE in combination, as well as in combination with all the other options.
NOEXPNTY will reduce the final
CLIMIT score by the number of unmatched
CLIMLIST tokens times the
NOEXPNTY value.
The
RECREF flag, when set to 1, will cause re-calculation of the
REFxxx record (as defined in the
GOPT parameter) based on the word types in
CLIMLIST . A value of 1 is recommended. For example, matching the following addresses:
The
SEARCH record is initially selected as the Reference record and a score of 80 returned.
CLIMIT processing is specified for codes only (shown above in Italics). Using RECREF,0 a Score of 66 is returned, as the original reference record is used. By using RECREF,1, CLIMIT will re-calculate the Reference record based on Codes only, and a Score of 100 is returned as now the
FILE record is used as the Reference. (See the
AVERAGE option on how to average the before and after scores.)
|
OPTION CLIMLIST
VALUE [word-type][word-type]. . . ,0
| This option defines the word types to be compared during
CLIMIT Score recalculation. The list of types is used only by the
CLIMIT option and has no effect if the
CLIMIT option is not enabled. If this list is not specified, a default list containing categories Y (non-major words), C(odes) and I(nitials) is used.
A maximum of 8 word types may be listed.
The following example will force
CLIMIT logic to recalculate the Score while only using words of type S(kip), M(ajor) and I(nitials).
A value field (0 above) is required but is not actually used.
APPLICATION REFERENCE guide > NAMESET section
.
|
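The way TRIGS, NSACTF, NOEXPNTY and NOINCR combine can be sketched as follows. This is a hedged illustration, not the product logic: the function and parameter names are hypothetical, the recalculated Score is supplied by the caller, the order in which NOEXPNTY and NOINCR are applied is an assumption, and AVERAGE is omitted because its formula is not given in this section.

```python
from typing import Optional


def climit_score(original: int, recalculated: Optional[int], trigs: int,
                 nsactf: int = 0, noincr: int = 0,
                 noexpnty: int = 0, unmatched_tokens: int = 0) -> int:
    """Hypothetical sketch of how the CLIMIT flags select the final Score."""
    # CLIMIT runs only when the initial Score is above TRIGS and is not 100.
    if original <= trigs or original == 100:
        return original
    # NSACTF decides what to return when no new Score is possible.
    if recalculated is None:
        return original if nsactf == 1 else 0
    # NOEXPNTY: reduce by (unmatched CLIMLIST tokens * NOEXPNTY value).
    score = max(0, recalculated - noexpnty * unmatched_tokens)
    # NOINCR: never let the recalculated Score exceed the original.
    if noincr == 1 and original < score:
        score = original
    return score


print(climit_score(90, 10, trigs=80))
print(climit_score(85, 95, trigs=80, noincr=1))
```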
SEARCH: 56 VALLEY RD NEWTOWN WA 2365
FILE: 56A VALLEY RD NEWTOWN WA 2365
SCORE: 090

SEARCH: 56 VALLEY RD NEWTOWN WA 2365
FILE: 56A VALLEY RD NEWTOWN WA 2365
SCORE: 080

OPTION CLIMIT
VALUE TRIGS,80
OPTION CLIMLIST
VALUE C,0
SEARCH: 56 VALLEY RD NEWTOWN WA 2365
FILE: 56A VALLEY RD NEWTOWN WA 2365
SCORE: 085

SEARCH: ANIMAL STORIES FROM OUTBACK AFRICA: ELEPHANTS
FILE: ANIMAL STORIES FROM OUTBACK AFRICA: TIGERS
SCORE: 090

OPTION CLIMIT
VALUE TRIGS,80
VALUE CHARDS,90
OPTION CLIMLIST
VALUE YM,0
SEARCH: ANIMAL STORIES FROM OUTBACK AFRICA: ELEPHANTS
FILE: ANIMAL STORIES FROM OUTBACK AFRICA: TIGERS
SCORE: 010

SEARCH: SNAPPY INVESTMENTS
FILE: ABC HOLDINGS T/AS SNAPPY INVESTMENTS
SCORE: 100
Option
| Description
|
---|---|
OPTION FLAGS
VALUE SECOND,{0/1/2/3/4/5}
|
Edit-list is set-up with the appropriate values. For example, assuming NEWTOWN is next to MIDTOWN, and MIDTOWN is next to OLDTOWN, but NEWTOWN is not next to OLDTOWN, then the Edit-list should contain at least the following:
Then, if using:
|
OPTION FLAGS
VALUE SECPHRSE,{0/1/2}
| This option allows you to create secondary phrase names. This matching option is equivalent to NAMESET function keyword
SECPHRASE or
SECPHRASEALL . Value 0 turns this feature off, 1 turns the feature on (NAMESET
SECPHRASE ) and 2 creates all secondary names (NAMESET
SECPHRASEALL ). The default is 0.
|
OPTION FLAGS
VALUE SECPORIG,{0/1}
| This option allows you to include original names before secondary phrase rules are applied. This matching option is equivalent to NAMESET function keyword
SECPHRASEORIG . Value 0 turns this feature off and 1 turns the feature on (NAMESET
SECPHRASEORIG ). The default is 0.
|
OPTION SCORES
VALUE SECPHRSE,n
| This option allows you to specify the maximum score to apply to secondary phrase matches. Value 0 turns this feature off and will assign a maximum score of 100. The default is 0.
|
OPTION SCORES OPTION SECCAT
VALUE [Edit-list Category],1
| Requires that both Secondary words being matched are in the
[Edit-list Category]
specified. Multiple Edit-list Categories can be specified using multiple
VALUE statements.
Requires
OPTION FLAGS ,
VALUE SECOND to be set to non-zero value. For example,
will only perform Secondary name matching on words which are in the SN Edit-list category.
|
OPTION SECTYPE
VALUE [Word-type],1
| Requires that both Secondary words being matched have the
[Word-type] specified. Multiple Word-types can be specified using multiple
VALUE statements.
Requires
OPTION FLAGS ,
VALUE SECOND to be set to non-zero value. For example,
will only perform Secondary name matching on words which have a Word-type of S (Skip).
|
OPTION SCORES
VALUE SEC,[Number]
| This option allows you to specify the word Score, from 0 to 10, for Secondary Word matches. The default is 10.
|
METHOD NAME=MNAME,WEIGHT=1, X
GOPT=(LENGTH*50+REFMIN), X
LOPT=(CONC+CINITA+INITLOW)
FIELD OFFSET=0,REPEAT=2

Search: John Smith
File: Mike Taylor John Smith
Option
| Description
|
---|---|
OPTION FLAGS
VALUE WORSTSCR,{0/1/2}
| In the descriptions below, the search and file records are assumed to contain the following:
The search record contains 3 fields in a repeating group:
And the file record contains 4 fields in a repeating group:
|
OPTION FLAGS
VALUE WSCRNOEX,[number]
| This option only has an effect when
WORSTSCR is set to 1 or 2. It is used to penalize the score in the case where the number of repeats differs between the search and file records. The score is reduced by this value for each extra field present. So, if
WSCRNOEX is set to 3 and using the sample data above, the score will be reduced by 3 (3 * (4 - 3)). If specified, it should be in the range 1-10. The default is 0, which means that no reduction in score occurs.
|
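The WSCRNOEX arithmetic in the row above, 3 * (4 - 3), can be written out as a tiny sketch. The function and parameter names are hypothetical; the rule that the option only applies when WORSTSCR is 1 or 2 is taken from the text.

```python
def wscrnoex_penalty(search_fields: int, file_fields: int,
                     wscrnoex: int = 0, worstscr: int = 1) -> int:
    """Hypothetical sketch of the WSCRNOEX Score reduction."""
    # The option only has an effect when WORSTSCR is 1 or 2.
    if worstscr not in (1, 2):
        return 0
    # Reduce by WSCRNOEX for each extra repeating field on either side.
    return wscrnoex * abs(file_fields - search_fields)


# Sample data from the text: 3 search fields, 4 file fields, WSCRNOEX = 3.
print(wscrnoex_penalty(3, 4, wscrnoex=3))
```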
Option
| Description
|
---|---|
OPTION FLAGS
VALUE USECWAIT,{0/1}
| The default behavior of N3SCM does not allow the maximum Score of a word pair to be less than 10.
This affects the word weight modifying options
MAJMOD ,
SKIPMOD and
CATSW . By setting USECWAIT to 1, the maximum Score of a word pair is allowed to be less than 10. For more information, see the
Word Weight Modification
section in N3SCM.
|
OPTION FLAGS
VALUE SCALEFTR,{1/10}
| The default behavior of N3SCM scores a word pair out of 10. By setting SCALEFTR to 1, it will score the word pair out of 100, providing a finer calculation. This mostly affects ranking options (such as
CATSW ), which can now be set out of 100 instead of 10, allowing a smaller reduction in score.
For example, using the default SCALEFTR,10 the following can be specified:
Then when matching the following two names, assuming Rick is defined in the Edit-list with a category of NN:
will score 95. When using the SCALEFTR,1 (i.e. word score out of 100) the following can be specified:
and will now score 97. |
Option
| Description
|
---|---|
OPTION NOEXCESS
VALUE TRIGGER,[trigger Score]
VALUE PENALTY,[penalty Score]
VALUE REFCNT,[maximum word count]
VALUE REFMULT,[penalty Score multiplier]
VALUE SREFCNT,[maximum skip word count]
VALUE SREFMULT,[penalty Score multiplier]
VALUE REFF,{0/1}
VALUE REFS,{0/1}
| When the Global Option REFMIN is specified (use the shorter record as the reference record – see the
Global Options
section for more details),
NOEXCESS can be used to decrement the method Score by the number of non-matching words in the non-reference (longer) record. The method Score must be equal to or greater than
[trigger Score] for this option to take effect. When
NOEXCESS is activated, the method Score is decremented by
[penalty Score] for each non-matching word in the non-reference record.
For example, with
with, with,
An additional penalty can be applied if the number of words present in the
REFMIN record’s word stack is equal to the number defined by
REFCNT . The default for
REFCNT is 1. The
REFCNT syntax allows any number of words to be specified; however this behavior was originally designed for cases when only one word was present in the
REFMIN name. For example:
REFCNT is used in conjunction with
REFS &
REFF to determine if the additional penalty is to be applied. If it is to be applied, the value of
REFMULT is multiplied by the penalty Score and the method Score is decremented further by the resulting value.
Specifying
REFS (the default) causes the logic to only check for the
REFCNT condition if the
REFMIN record is the Search record. Specifying
REFF causes the logic to only check for the
REFCNT condition if the
REFMIN record is the File record. Specify both if either Search or File record can be checked.
For example, with,
with,
This logic in effect says, if the
REFMIN record is the Search record, and it contains only one word, subtract an additional penalty from the Score equal to
REFMULT x PENALTY .
SREFCNT and
SREFMULT act similarly, however, with the added restriction that the word must be a Skip word.
|
OPTION NOEXCLST
VALUE [Word-type],0
| If the number of words in a name changes due to the action of the
CONC option then
NOEXCESS will not degrade the Score. It may be desirable to have
NOEXCESS to count the number of words in a name after any concatenation has occurred. Switching on
NOEXCLST to re-Score particular Word-types will allow this. For example, with
But adding:
|
OPTION REFN
VALUE WORDS,[number of words]
VALUE GOOD,[word Score]
VALUE SCORE,[method Score]
VALUE USECATSW [0|1]
| Returns a user-defined Score
[method Score] when a specified number of non-initial words
[number of words] match with a Score of at least
n/10 [word Score] . This is irrespective of how other data in the name may have or may have not matched.
The default matching behavior (and the way N3SCM works) is that every token (word or initial) in the reference record will contribute to the method Score by how well it matched. This option allows a method Score to be determined based on the matching of only certain number of tokens from the reference record.
One use for this option is when the need is to confirm a match of two names which have already been identified as having the same id-number (e.g. same social security number).
For example, if the following values were defined in the matching scheme,
and a search on id-number returned the following two names,
the Score returned by the method would be 95, based on the matching of the two words JOHN and THOMPSON.
You can use the USECATSW option to enable the CATSW option. The value 1 indicates to enable the CATSW option and the value 0 indicates to disable the CATSW option. Default is 0.
|
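The NOEXCESS rules described in this table (trigger, per-word penalty, and the additional REFCNT/REFMULT penalty gated by REFS and REFF) can be sketched as below. The function and parameter names are hypothetical, the counts are supplied by the caller, and applying the additional penalty only after the trigger test passes is an assumption. SREFCNT and SREFMULT are not modelled.

```python
def noexcess_score(score: int, unmatched_words: int, ref_word_count: int,
                   ref_is_search: bool = True, trigger: int = 101,
                   penalty: int = 1, refcnt: int = 1, refmult: int = 0,
                   refs: int = 1, reff: int = 0) -> int:
    """Hypothetical sketch of the NOEXCESS Score decrement."""
    # NOEXCESS takes effect only at or above the trigger Score.
    if score < trigger:
        return score
    # Decrement by PENALTY for each non-matching word in the longer record.
    result = score - penalty * unmatched_words
    # REFS / REFF decide whether the REFCNT condition is checked at all.
    check = (ref_is_search and refs == 1) or (not ref_is_search and reff == 1)
    if check and ref_word_count == refcnt:
        # Additional penalty equal to REFMULT x PENALTY.
        result -= refmult * penalty
    return max(0, result)


print(noexcess_score(100, 2, 3, trigger=80, penalty=5))
print(noexcess_score(100, 2, 1, trigger=80, penalty=5, refmult=2))
```

The keyword defaults in the sketch are chosen so that, as with a trigger of 101, nothing changes unless the caller lowers the trigger.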
OPTION CODESCOR
VALUE CODEWGHT,<codeweight>
VALUE CODEUDIF,25
VALUE CODEUNON,90
VALUE CODEUONE,50
VALUE CODEMAXD,6
VALUE CODEPOSS,1
OPTION SORTSCOR
VALUE CLN,<clnweight>
VALUE FMT,<fmtweight>
VALUE SORTWGHT,<sortweight>
VALUE NGRAMC,<weight>
VALUE NGRAMCLV,<level>
VALUE NGRAMF,<weight>
VALUE NGRAMFLV,<level>
Normal Score Weight = 100 - (Sort Score Weight + Code Score Weight)
Final Score = ((Sort Score * Sort Score Weight) + (Code Score * Code Score Weight) + (Normal Score * Normal Score Weight)) / 100

Normal Score Weight = 100 - Sort Score Weight
Final Score = ((Sort Score * Sort Score Weight) + (Normal Score * Normal Score Weight)) / (100 - Code Score Weight)
Code Score = 100 - (100 * Number of Codes in Stack / Number of Entries in the Stack That Has Codes)
if (Number of Entries in the Stack That Does Not Have Codes < Number of Entries in the Stack That Has Codes) then
    Weight = 100 * Number of Entries in the Stack That Does Not Have Codes / Number of Entries in the Stack That Has Codes
    Code Score = Code Score * Weight / 100
endif

"G " grams
"L " litres
"M " metres
"KG " kilograms
"MG " milligrams
"KL " kilolitres
"ML " millilitres
"CL " centilitres
"DL " decilitres
"MM " millimetres
"CM " centimetres
Intermediate Score = 100 * Smaller Number / Larger Number
VALUE CODEUDIF,25
VALUE CODEUNON,90
VALUE CODEUONE,50
score = Number of Matching Characters * 100 / Total Number of Characters in the Code
score = (Number of Matching Characters * 100 * Length of Shorter Code) / (Length of Longer Code * Length of Longer Code)
if (reflen <= Number of Matching Characters) then
    score = 100
else
    score = 100 * Number of Matching Characters / reflen
endif
"AC/DC "
and the file name is
"EDC...BA "
then these names are first cleaned and the resulting names are sorted, resulting in
"ACCD"
(number of matching characters) * 100 / (Length of reference string)
"01 DC " "02 AC "
and the file name word stack is
"01 BC " "02 AC " "03 DC "
then these stacks are first sorted, results being
"AC " "DC "
and
"AC " "BC " "DC "
(Number of Matching Stack Entries) * 100 / (Total Number of Entries in the Reference Stack)
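The stack formula above can be tried on the two word stacks from the example, one holding DC and AC and the other holding BC, AC and DC. The sketch below is an assumption-laden illustration: the function name is hypothetical, and counting a "matching entry" as an exact word found in the other stack is an assumption, since the matching rule is not spelled out here.

```python
from typing import List


def stack_sort_score(reference: List[str], other: List[str]) -> int:
    """Hypothetical sketch: matching stack entries * 100 / reference stack size."""
    # Sort both stacks first, as the text describes (padding blanks removed).
    ref = sorted(word.strip() for word in reference)
    remaining = sorted(word.strip() for word in other)
    if not ref:
        return 0
    matches = 0
    for word in ref:
        if word in remaining:
            remaining.remove(word)  # each entry may be matched only once
            matches += 1
    # Integer division drops any remainder.
    return matches * 100 // len(ref)


print(stack_sort_score(["DC", "AC"], ["BC", "AC", "DC"]))
print(stack_sort_score(["BC", "AC", "DC"], ["DC", "AC"]))
```

Which stack acts as the reference changes the result: two matches out of two reference entries versus two out of three.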
Score = ((CLN Score * <clnweight>) + (FMT Score * <fmtweight>)) / 100
(Remainder, if any, is dropped.)
Score = ((75 * 75) + (100 * 25)) / 100
which results in 81.
Score = ((60 * 75) + (66 * 25)) / 100
which results in 61.
Final Score = ((Combined Sort Score * Sort Weight) + (Normal Score * (100 - Sort Weight))) / 100
(Remainder, if any, is dropped.)
Final Score = ((81 * 35) + (90 * (100 - 35))) / 100
which results in 86.
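The weighted formulas above can be checked with a few lines of integer arithmetic. The function names are hypothetical; the formulas and the sample numbers (weights 75 and 25, sort weight 35, normal Score 90) are the ones shown in the worked examples, and integer division is used because the text says any remainder is dropped.

```python
def combined_sort_score(cln_score: int, fmt_score: int,
                        clnweight: int, fmtweight: int) -> int:
    """Combine the CLN and FMT sort scores using their weights."""
    return (cln_score * clnweight + fmt_score * fmtweight) // 100


def final_score(sort_score: int, normal_score: int, sort_weight: int) -> int:
    """Blend the combined sort score with the normal Score."""
    return (sort_score * sort_weight + normal_score * (100 - sort_weight)) // 100


print(combined_sort_score(75, 100, 75, 25))   # 81
print(combined_sort_score(60, 66, 75, 25))    # 61
print(final_score(81, 90, 35))                # 86
```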
VALUE NGRAMC,<weight>
VALUE NGRAMCLV,<level>
VALUE NGRAMF,<weight>
VALUE NGRAMFLV,<level>

VALUE CLN,<clnweight>
VALUE FMT,<fmtweight>
VALUE NGRAMC,<ngramcweight>
VALUE NGRAMF,<ngramfweight>

Score = ((CLN Score * <clnweight>) + (FMT Score * <fmtweight>) + (NGRAMC Score * <ngramcweight>) + (NGRAMF Score * <ngramfweight>)) / 100
(Remainder, if any, is dropped.)
OPTION CLIMIT
VALUE CHARDS,100
VALUE NSACTF,0
VALUE TRIGS,100
VALUE NOINCR,0
VALUE AVERAGE,0
VALUE NOEXPNTY,0
OPTION CONCAT
VALUE PLURALS,0
OPTION FLAGS
VALUE EXACTCAT,0
VALUE EXACTINI,1
VALUE EXACTWRD,1
VALUE INITCODE,1
VALUE LIMWCAT,1
VALUE MATCHEND,1
VALUE OPTIMILW,1
VALUE SECOND,0
VALUE SYNCS,2
VALUE TRANSLEN,1
VALUE USECWAIT,0
VALUE SKIPMAJM,0
VALUE SKIPSMOD,0
VALUE SCALEFTR,10
VALUE CATSWEXT,0
VALUE CATSWD,0
VALUE CATSWF,0
OPTION NOEXCESS
VALUE PENALTY,1
VALUE TRIGGER,101
VALUE REFCNT,1
VALUE REFMULT,0
VALUE REFS,1
VALUE REFF,0
VALUE SREFCNT,1
VALUE SREFMULT,0
OPTION ORDER (ORDER options are disabled by default)
VALUE POS,9999
VALUE SEQ,9999
VALUE TRIGGER,9999
OPTION REFN
VALUE GOOD,9
VALUE SCORE,99
VALUE WORDS,0
OPTION SCORES
VALUE ILOWTRIG,10
VALUE SEC,10
OPTION CODESCOR
VALUE CODEWGHT,0
OPTION SORTSCOR
VALUE SORTWGHT,0