Populations and Controls

Fuzzy Matching

Fuzzy matching returns a score from 0 through 100 percent that indicates how closely the search data and file data values match. You can perform fuzzy matching on any data type.
Use the following format to perform fuzzy matching:
SEARCH=<Field Name>[(<Field ID>;<Algorithm Name>[:<Upper Score Limit>;<Lower Score Limit>])] FILE=<Field Name>[(<Field ID>;<Algorithm Name>[:<Upper Score Limit>;<Lower Score Limit>])]
The format uses the following parameters:
  • Field Name
    . Name of the SSA-NAME3 field.
  • Field ID
    . Number that you want to assign to the extended field. To extend a key field, use any nonzero number as the Field ID value. If you do not want to extend a key field, use 0 as the Field ID value. If you extend a key field in the search data, also extend the corresponding key field in the file data.
  • Algorithm Name
    . Name of the supported algorithm that performs the fuzzy matching.
    Use one of the following values:
    • Levenshtein
      . Scores the records based on the minimum number of single-character edits required to change one word into the other.
    • Dice
      . Scores the records based on Dice's coefficient.
    • JaroWinkler
      . Scores the records based on the Jaro-Winkler similarity, which considers the number of matching characters and transpositions between the two words and gives extra weight to a common prefix.
  • Upper Score Limit/Lower Score Limit
    . Score thresholds that determine whether the records are considered matches, unique records, or undecided matches.
    If a score is greater than the upper score limit, the records are considered a match. If a score is less than the lower score limit, the records are considered unique records. If a score falls between the lower and upper score limits, the records are considered undecided matches. Default is 0.
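The following Python sketch illustrates how a percentage score and the two score limits could interact. It is not the SSA-NAME3 implementation: the function names, the normalization of the Levenshtein distance to a 0 through 100 score, and the treatment of a score that exactly equals a limit are all assumptions made for illustration.

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn one string into the other."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity_percent(a: str, b: str) -> float:
    """Assumed normalization: 100 * (1 - distance / length of longer value)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0  # two empty values are treated as identical
    return 100.0 * (1 - levenshtein_distance(a, b) / longest)


def classify(score: float, upper: float, lower: float) -> str:
    """Apply the upper and lower score limits as described above.
    Assumption: a score exactly equal to a limit counts as undecided."""
    if score > upper:
        return "match"
    if score < lower:
        return "unique"
    return "undecided"


if __name__ == "__main__":
    for search_value, file_value in [("JOHNSON", "JOHNSTON"),
                                     ("SMITH", "SMYTHE"),
                                     ("kitten", "sitting")]:
        score = similarity_percent(search_value, file_value)
        print(search_value, file_value, round(score, 2),
              classify(score, upper=79, lower=60))
```

With the limits 79 and 60 used here, a pair that differs by one character in eight falls above the upper limit, while a pair that differs by three characters in seven falls below the lower limit.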
The following sample expressions perform fuzzy matching between the search data and file data values:
  • SEARCH=Attribute1(1;Levenshtein:79;60) FILE=Attribute1(1;Levenshtein:79;60)
    . The expression extends the Attribute1 field and uses the Levenshtein algorithm to perform fuzzy matching.
  • SEARCH=Attribute1(0;JaroWinkler) FILE=Attribute1(0;JaroWinkler)
    . The expression does not extend the Attribute1 field and uses the JaroWinkler algorithm to perform fuzzy matching.
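To make the structure of these expressions concrete, the following Python sketch splits an expression into its parts. The parser, the `FieldSpec` type, and the regular expression are hypothetical helpers inferred from the sample expressions above; they are not part of the product, and the real syntax may allow forms that this pattern does not cover.

```python
import re
from typing import List, NamedTuple


class FieldSpec(NamedTuple):
    side: str        # "SEARCH" or "FILE"
    field_name: str
    field_id: int    # 0 means the key field is not extended
    algorithm: str
    upper: int       # upper score limit, 0 if omitted
    lower: int       # lower score limit, 0 if omitted


# Assumed pattern: SIDE=Name(FieldID;Algorithm[:Upper;Lower])
_SPEC = re.compile(r"(SEARCH|FILE)=(\w+)\((\d+);(\w+)(?::(\d+);(\d+))?\)")


def parse_expression(expression: str) -> List[FieldSpec]:
    """Return one FieldSpec for each SEARCH= or FILE= clause found."""
    specs = []
    for match in _SPEC.finditer(expression):
        side, name, field_id, algorithm, upper, lower = match.groups()
        specs.append(FieldSpec(side, name, int(field_id), algorithm,
                               int(upper or 0), int(lower or 0)))
    return specs


if __name__ == "__main__":
    for spec in parse_expression(
            "SEARCH=Attribute1(1;Levenshtein:79;60) "
            "FILE=Attribute1(1;Levenshtein:79;60)"):
        print(spec)
```

When the score limits are omitted, as in the JaroWinkler sample, the sketch falls back to 0 for both limits, following the default stated in the parameter description.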
