MDM Registry Edition
- MDM Registry Edition 10.2
PURPOSE=<expression>
[MATCH_LEVEL=matchlevel][+/-nn][+/-nn]
[ADJWEIGHT=field,+/-n]
[NAMEFORMAT=name format]
[EARLYEXIT=Y/N]
[UNICODE_ENCODING=Unicode type]
[ENCODING=Unicode type]
[DELIMITER=delimiter]
[MATCH_OPTIONS=(fieldname:option,...)]
[MIN_MATCH_FIELDS=number of match field pairs]
[SEARCH=field1[(matching type expression1)],offset1,length1,... , fieldn[(matching type expressionn)],offsetn,lengthn]
[FILE=field1[(matching type expression1)],offset1,length1,... , fieldn[(matching type expressionn)],offsetn,lengthn]
[FMTOVERRIDES=Field1[(<Field ID>)],cX,cX,cX,cX:Field2[(<Field ID>)],cX,cX,cX,cX]
[LWM=Y/N/ONLY]
[LWM_FIELDS=field1,weight1[,...,fieldn,weightn]]
[LWM_LIMIT=rejectscore[,acceptscore]]
Control
| Mandatory/Optional
| Description
|
---|---|---|
PURPOSE
| Mandatory
| Specifies the name of the matching purpose to use in a match call. The PURPOSE control uses the following syntax:
PURPOSE=(<expression>)
Use one of the following formats for <expression>:
In the ssan3_match section of the SSA-NAME3 Workbench, under Mandatory Controls, you can find a list of the supported purpose names. With most standard populations, you can use one of the following predefined purposes:
For more information about each purpose and its required and optional fields, from the Help menu of the SSA-NAME3 Workbench, click Population Documentation.
The simple form of <expression> is PURPOSE=<Purpose_Name>. For example, PURPOSE=Address performs matching on the specified address fields.
|
MATCH_LEVEL
| Optional
| Controls the match level. With most standard populations, you can use one of the following predefined match levels:
The default value is Typical.
Use the MATCH_LEVEL control to adjust the predefined accept or reject score limits that affect the match decisions.
The MATCH_LEVEL control uses the following guidelines:
The MATCH_LEVEL control is not applicable to the Geocode purpose. If you specify the MATCH_LEVEL control with a Geocode purpose, SSA-NAME3 ignores the MATCH_LEVEL control.
You can use one of the following match level expressions:
|
ADJWEIGHT
| Optional
| Adjusts the weight of a single field in a match purpose up or down relative to the other fields in the purpose. Use the following format:
ADJWEIGHT=field{+/-}n
The field is a valid field name in the purpose that you define in the PURPOSE= control, and n is a single-digit number.
Standard populations use weights that are less than 10, so use small ADJWEIGHT adjustments. For example, with PURPOSE=Individual, the definition ADJWEIGHT=Person_Name+2 increases the weight of the Person_Name field in the Individual match purpose by 2. The definition increases the importance of the Person_Name field and decreases the relative importance of the other fields. Experiment with different values across a representative sample set of records to gauge the overall effect of this setting.
|
NAMEFORMAT=L/R
| Optional
| Defines whether the major word in the name or address is on the left end or the right end. For example, in Western person names, the family name is on the right end of the name.
Use the NAMEFORMAT control to override the default name format. For more information about the default name format for a given standard population field and its effects, from the Help menu of the SSA-NAME3 Workbench, click Population Documentation.
In some purposes, such as Person_Name, Household, Family, and Wide_Household, you can give extra weight to matches on the major words in the Person_Name field.
The NAMEFORMAT control in a match call overrides the name format for all name and address fields in the match purpose, so use the NAMEFORMAT control with caution.
|
EARLYEXIT=Y/N
| Optional
| By default, SSA-NAME3 matching checks the score of the first field in a purpose. If the score is low enough to cause a reject of the entire record, SSA-NAME3 takes an early exit from the match process. The early exit can improve performance, and the gain grows with the number of records that it rejects. When SSA-NAME3 takes an early exit, it sets the record score to 0. To preserve the real score of unmatched records, for example for analysis in the Developer’s Workbench, disable the early exit logic by specifying EARLYEXIT=N.
If you disable the early exit logic, lightweight matching is also disabled, even if you have enabled it by configuring LWM=Y.
|
UNICODE_ENCODING
| Optional
| Instructs SSA-NAME3 to accept Unicode data input, and specifies the Unicode format of the data that you want to pass.
|
ENCODING
| Optional
| Functions similarly to the UNICODE_ENCODING keyword.
|
DELIMITER
| Optional
| Overrides the default delimiter, the asterisk (*), when you pass the key field data in the Tagged Data Format.
|
MATCH_OPTIONS
| Optional
| Defines options for a specific field. The MATCH_OPTIONS control uses the following syntax:
MATCH_OPTIONS=(fieldname:option,...)
Use the MATCH_OPTIONS control for the following fields:
|
MIN_MATCH_FIELDS
| Optional
| Specifies the minimum number of field pairs that must be supplied. A field counts as a pair only if it is supplied in both the search record and the file record. Use the following format:
MIN_MATCH_FIELDS=number
|
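The counting rule for MIN_MATCH_FIELDS can be illustrated with a short Python sketch. It is not SSA-NAME3 code: the dictionaries stand in for already-parsed search and file records, the helper names `count_field_pairs` and `meets_min_match_fields` are hypothetical, and treating blank-only values as not supplied is an assumption.

```python
def count_field_pairs(search, file):
    """Count the fields that carry a value in both the search and file record."""
    def supplied(record, name):
        value = record.get(name)
        # Assumption: a missing or blank-only value counts as "not supplied".
        return value is not None and value.strip() != ""

    return sum(1 for name in search if supplied(search, name) and supplied(file, name))


def meets_min_match_fields(search, file, minimum):
    """Return True when enough field pairs are supplied to attempt a match."""
    return count_field_pairs(search, file) >= minimum


# Hypothetical sample records: only Person_Name is present on both sides.
search = {"Person_Name": "John Smith", "Address_Part1": "12 High St", "Date": ""}
file = {"Person_Name": "Jon Smith", "Address_Part1": "", "Date": "19800101"}

print(count_field_pairs(search, file))          # 1
print(meets_min_match_fields(search, file, 2))  # False
```

In this sketch, a setting equivalent to MIN_MATCH_FIELDS=2 would not be satisfied, because only one field is supplied in both records.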
SEARCH and FILE
| Optional
| Use the SEARCH and FILE controls to specify the field names and their offset and length pairs in the scatter or gather format for the data in the Search Data and File Data fields. If you do not specify the offset and length pairs for the field names, SSA-NAME3 treats the data as being in the tagged data format.
You can extend any of the key fields and set the weight for the extended fields. The extended fields use the algorithm of the key fields. Use the extended fields to override the weight of the key fields at run time. You can also specify the type of matching to perform between the data in the Search Data and File Data fields.
You can use the following matching types:
If you specify different matching types in the SEARCH and FILE parameters, the matching type that you specify in the SEARCH parameter takes precedence.
Exact Matching
Use the following format to perform exact matching:
PURPOSE=<Purpose Name>
SEARCH=<Field Name>[(<Field ID>;<Weight>;Exact[:B|Z|ZB])],<Offset>,<Length>
FILE=<Field Name>[(<Field ID>;<Weight>;Exact)],<Offset>,<Length>
If you want to extend a key field, use any number as the Field ID value. If you do not want to extend any key field, use 0 as the Field ID value. If you extend a key field in either the search data or the file data, you must also extend the corresponding key field in the other.
The value B indicates a null value, the value Z indicates a zero value, and the value ZB indicates a null or zero value.
The following sample expressions perform exact matching between the search data and file data values:
Reverse Matching
Reverses the original score. Use the following format to perform reverse matching:
PURPOSE=<Purpose Name>
SEARCH=<Field Name>[(<!Field ID>;<Weight>;)],<Offset>,<Length>
FILE=<Field Name>[(<!Field ID>;<Weight>;)],<Offset>,<Length>
The exclamation mark (!) that precedes the numerical value for Field ID triggers the score reversal. The exclamation mark is mandatory for the SEARCH field and optional for the FILE field.
The following sample expression performs reverse matching of the organization name:
PURPOSE=Fields SEARCH=Organization_Name(!0),0,11,Person_name,11,14
FILE=Organization_Name(!0),0,11,Person_name,11,14
InformaticaJohn Smith
InformaticaJohn Smith
For the preceding sample expression, you get a reverse score of 0% for Informatica instead of the usual score of 100%.
Fuzzy Matching
Use the following format to perform fuzzy matching:
SEARCH=<Field Name>[(<Field ID>;<Algorithm Name>[:<Upper Score Limit>;<Lower Score Limit>])]
FILE=<Field Name>[(<Field ID>;<Algorithm Name>[:<Upper Score Limit>;<Lower Score Limit>])]
The format uses the following parameters:
The following sample expressions perform fuzzy matching between the search data and file data values:
Inexact Matching
Use the following format to perform inexact matching:
PURPOSE=<Purpose Name>
SEARCH=<Field Name>[(<Field ID>;<Weight>;!Exact[:B|Z|ZB])],<Offset>,<Length>
FILE=<Field Name>[(<Field ID>;<Weight>;!Exact)],<Offset>,<Length>
If you want to extend a key field, use any number as the Field ID value. If you do not want to extend any key field, use 0 as the Field ID value. If you extend a key field in either the search data or the file data, you must also extend the corresponding key field in the other.
The value B indicates a null value, the value Z indicates a zero value, and the value ZB indicates a null or zero value.
The following sample expressions perform inexact matching between the search data and file data values:
Range Matching
Use one of the following formats to perform range matching on numeric fields:
Use one of the following formats to perform range matching on date fields:
|
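The offset and length pairs in the SEARCH and FILE controls describe fixed-position fields inside a record. The following Python sketch illustrates that layout by using the reverse matching sample (`Organization_Name(!0),0,11,Person_name,11,14` against the record `InformaticaJohn Smith`). It is an illustration only, not the SSA-NAME3 parser: the function names `parse_layout` and `extract_fields` are hypothetical, offsets are assumed to be zero-based as the sample suggests, and stripping trailing padding is an assumption.

```python
import re

# One field entry: name, optional "(matching type expression)", offset, length.
FIELD_SPEC = re.compile(r"([A-Za-z_]\w*)(?:\(([^)]*)\))?,(\d+),(\d+)")


def parse_layout(spec):
    """Return (field name, matching type expression, offset, length) tuples."""
    return [
        (m.group(1), m.group(2), int(m.group(3)), int(m.group(4)))
        for m in FIELD_SPEC.finditer(spec)
    ]


def extract_fields(spec, record):
    """Slice each field out of a fixed-position record."""
    return {
        name: record[offset:offset + length].rstrip()  # assumption: drop trailing padding
        for name, _expr, offset, length in parse_layout(spec)
    }


spec = "Organization_Name(!0),0,11,Person_name,11,14"
record = "InformaticaJohn Smith"

print(extract_fields(spec, record))
# {'Organization_Name': 'Informatica', 'Person_name': 'John Smith'}
```

The first 11 characters form the organization name, and the next 14 positions hold the person name, which is why the reversed score in the sample applies to `Informatica` only.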
FMTOVERRIDES
| Optional
| Specifies whether to disable or override a list of category names and category types.
Use the following format:
PURPOSE=<Purpose Name>
SEARCH=<Field Name>[(<Field ID>;<Weight>;)],<Offset>,<Length>
FILE=<Field Name>[(<Field ID>;<Weight>;)],<Offset>,<Length>
FMTOVERRIDES=<Field Name1>[(<Field ID>)],cX,cX,cX,cX:<Field Name2>[(<Field ID>)],cX,cX,cX,cX
The format uses the following values:
The following sample configuration disables nickname words:
PURPOSE=Fields SEARCH=Person_name,0,4 FILE=Person_name,0,4
FMTOVERRIDES=Person_Name(1),NRRNNNK,TS:Person_Name(0),NRRNNNK
Mick
Mike
For the preceding sample configuration, you get a score of 75% because SSA-NAME3 disables the NK category name in the Edit-list.
|
FILTER_SEARCHVALUES
| Optional
| Use the FILTER_SEARCHVALUES control to specify a list of values to match with the data in the Search Data field, the File Data field, or both fields. Use the Filter purpose to specify the data in the Search Data and File Data fields.
The FILTER_SEARCHVALUES control uses the following format to specify a list of search values:
For example:
The FILTER_SEARCHVALUES control uses the following parameters:
You can also perform inexact matching to get a score of 100% if the list values do not match the search or file data.
To perform inexact matching, use the following format when you define the Filter purpose:
PURPOSE=(NOT Filter<Number>, NOT Filter<Number>,...)
The NOT keyword indicates that you want to perform inexact matching. For example, the PURPOSE=(NOT Filter2) FILTER_SEARCHVALUES=2,SP,(WILLIAM,BILL) expression returns a score of 100% if the list values do not match the search data.
|
LWM=Y/N/ONLY
| Optional
| Enables or disables lightweight matching. Use the value Y to enable lightweight matching. Lightweight matching uses a fast score estimate to reject the obvious mismatches. The records that pass lightweight matching go on to full scoring for robust scoring and ranking. SSA-NAME3 returns the full score and the decision to the caller.
If you create system definition files by using the SDF Wizard, lightweight matching is enabled by default.
Use the value N to disable lightweight matching. SSA-NAME3 matching performs full scoring on all the matching records.
Use the value ONLY to enable lightweight matching and disable full scoring. Lightweight matching returns the estimate as the final score to the caller.
|
LWM_FIELDS
| Optional
| Specifies the fields to which you want to apply lightweight matching and their weights. At run time, these values override the values that you have defined in the match purpose. Based on the lightweight matching scores, SSA-NAME3 rejects the obvious mismatches. If you do not set any value, SSA-NAME3 retrieves the fields from the match purpose and assigns equal weight to them.
The syntax of the LWM_FIELDS control is as follows:
LWM_FIELDS=<field1>,<weight1>[,...,<fieldn>,<weightn>]
where field is a valid field name that you have defined in the PURPOSE control, and weight is the relative significance of the specified field (0-100) when compared to the other fields.
For example, LWM_FIELDS=Person_Name,5,Address_Part1,1
Lightweight matching is useful when you apply it to fields that have low variation, such as addresses. It is less suitable for fields with high variation, where SSA-NAME3 handles the variations through the Edit-list and lightweight matching might incorrectly reject records.
|
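As a minimal sketch of the LWM_FIELDS syntax, the following Python function splits a value such as `Person_Name,5,Address_Part1,1` into field and weight pairs and checks the documented 0-100 weight range. The function name `parse_lwm_fields` is hypothetical, and the error handling is an assumption about how a caller might validate the value before passing it to SSA-NAME3.

```python
def parse_lwm_fields(value):
    """Split an LWM_FIELDS value into (field name, weight) pairs."""
    parts = [part.strip() for part in value.split(",")]
    if len(parts) % 2 != 0:
        raise ValueError("LWM_FIELDS expects field,weight pairs")

    pairs = []
    for name, weight_text in zip(parts[0::2], parts[1::2]):
        weight = int(weight_text)  # raises ValueError for non-numeric weights
        if not 0 <= weight <= 100:
            raise ValueError(f"weight for {name} must be between 0 and 100")
        pairs.append((name, weight))
    return pairs


print(parse_lwm_fields("Person_Name,5,Address_Part1,1"))
# [('Person_Name', 5), ('Address_Part1', 1)]
```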
LWM_LIMIT
| Optional
| Specifies the accept and reject limits for the lightweight matching score. Based on the limits, SSA-NAME3 accepts or rejects the search results.
The syntax of the LWM_LIMIT control is as follows:
LWM_LIMIT=<Reject>[,<Accept>]
where Reject and Accept are integer values that range from 0 through 100.
For example, LWM_LIMIT=50,90
If LWM=N, the LWM_LIMIT control has no effect.
If LWM=Y, SSA-NAME3 rejects the records whose lightweight matching score is less than the reject limit. The accept limit has no effect, and you can omit it.
If LWM=ONLY, SSA-NAME3 rejects the records whose lightweight matching score is less than the reject limit and accepts the records whose score is greater than the accept limit. It marks the records whose score is greater than or equal to the reject limit and less than the accept limit as undecided.
The default reject limit is 65, and the default accept limit is 90. If you have not set the accept limit and the reject limit is greater than 90, the accept limit is equal to the reject limit.
|
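The interaction between LWM and LWM_LIMIT can be summarized in a small Python sketch. It models only the rules described for the LWM_LIMIT control and is not SSA-NAME3 code: the names `resolve_limits` and `lwm_decision` and the returned labels are hypothetical. The description does not say what happens when a score is exactly equal to the accept limit, so the sketch assumes that such a score is accepted.

```python
DEFAULT_REJECT = 65
DEFAULT_ACCEPT = 90


def resolve_limits(reject=None, accept=None):
    """Apply the documented defaults for the reject and accept limits."""
    if reject is None:
        reject = DEFAULT_REJECT
    if accept is None:
        # An unset accept limit follows the reject limit once it exceeds 90.
        accept = max(DEFAULT_ACCEPT, reject)
    return reject, accept


def lwm_decision(score, mode, reject=None, accept=None):
    """Classify a lightweight matching score for LWM=Y, N, or ONLY."""
    reject, accept = resolve_limits(reject, accept)
    if mode == "N":
        return "full scoring"  # LWM_LIMIT has no effect
    if mode not in ("Y", "ONLY"):
        raise ValueError("mode must be Y, N, or ONLY")
    if score < reject:
        return "reject"
    if mode == "Y":
        return "full scoring"  # the accept limit is ignored
    # Assumption: a score equal to the accept limit is treated as accepted.
    return "accept" if score >= accept else "undecided"


print(lwm_decision(50, "ONLY"))             # reject
print(lwm_decision(70, "ONLY"))             # undecided
print(lwm_decision(95, "ONLY"))             # accept
print(lwm_decision(70, "Y"))                # full scoring
print(lwm_decision(95, "ONLY", reject=95))  # accept
```

The last call shows the fallback rule: with a reject limit of 95 and no accept limit, the accept limit also becomes 95.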