Configuring the Matching Rules
You can define one or more matching rules for an index in the matching rules file. You can define the matching rules within the
To define a matching rule, add the following parameters to the
section within the
- MatchConfiguration. Includes a matching rule and its properties. Use multiple MatchConfiguration sections within the MatchRuleSet section to configure multiple rules.
- You can set the following properties for a matching rule:
- MatchRuleID. Unique name for the match rule. The name cannot exceed 14 characters.
- AutoMergeInd. Indicates whether to merge the matching records manually or automatically. Set to Yes for automatic merging, and set to No for manual merging.
- Required for fuzzy matching. Type of purpose that you want to use for matching. You can use one of the following standard SSA-NAME3 purposes:
- Address. Identifies an address match.
- Contact. Identifies a contact within an organization at a specific location.
- Division. Identifies an organization at an address.
- Fields. Identifies generic data.
- Household. Identifies individuals with same or similar family names who share the same address.
- Individual. Identifies a specific individual by name, ID, or date of birth.
- Organization. Identifies an organization by name.
- Person_Name. Identifies a person by name.
- Resident. Identifies a person at an address.
- Wide_Contact. Identifies a contact within an organization.
- Required for fuzzy matching. Level of matching that you want to perform. Use one of the following values:
- Typical. Returns more results than the conservative match level and fewer results than the loose match level.
- Conservative. Returns closer matches than the typical match level. Use this level in environments where the accuracy of a match is important.
- Loose. Returns matches with more variations than the typical match level. Use this level in environments where you can manually review the results.
- Required for fuzzy matching. Minimum match score to consider a record as a matching record.
- Lower Threshold
Optional. Minimum match score to consider a record as a probable matching record. If a match score is between the threshold and lower threshold values, the record is considered as a probable matching record.
If you do not specify a lower threshold value, the matching process does not identify any probable matching records.
- MField. Maps the SSA-NAME3 fields to the input record fields and sets the properties for each field. You can set the following properties for each field:
- name. Indicates the SSA-NAME3 field for the input record field.
- type. Indicates the type of matching to perform on the field. Set to Fuzzy to perform fuzzy matching, and set to Exact to perform exact matching on the field values.
- segment_ind. Optional. Indicates whether to enable segment matching. Use segment matching to match records that contain any of the specified values for a field. Set to 1 to enable segment matching and 0 to disable segment matching. Default is 0.
If you enable segment matching for a field, the matching process ignores the null matching or non-equal matching configuration.
- segment_val. Optional. List of values based on which you want to perform segment matching. Specify the segment_val parameter only when you enable segment matching.
The following sample code configures segment matching for the
<MField name="City" type="Exact" segment_ind="1" segment_val="New York,Tokyo">City</MField>
The preceding sample code matches the records that have New York or Tokyo as the
If you want to include null in the list of values, add a comma before the first segment value. For example,
- null_ind. Optional. Indicates how the matching process must handle null values. You can set the null_ind property to one of the following values:
- 0. Does not match a null value with any other values. Default is 0.
- 1. Considers only two null values as a match and does not match any other values. If you want the match rule to match other values, create another similar rule and set the null_ind property to 0.
- 2. Considers only a null value and a non-null value as a match and does not match any other values. If you want the match rule to match other values, create another similar rule and set the null_ind property to 0.
During the linking process, two matched records are linked to the same cluster. When you set null_ind=2 and one of the linked records contains a null value, the matching process excludes that record from further matching.
For example, consider the following records:
- 100 John Smith Redwood City
- 200 John Smith
- 300 John Smith Las Vegas
- 400 John Smith Toronto
When you set null_ind=2, record 100 matches with record 200, and the records then link to cluster 1. The matching process excludes record 200 from further matching. Record 300 and record 400 do not match with record 100, and separate clusters are created for record 300 and record 400.
If you enable segment matching, the null matching configuration is ignored.
- anti_ind. Optional. Indicates whether to perform non-equal matching. Use non-equal matching to prevent equal values of a field from matching each other and to return a successful match only when the values do not match. Set to 1 to enable non-equal matching and 0 to disable non-equal matching. Default is 0.
If you enable segment matching, the non-equal matching configuration is ignored.
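As a hedged illustration of the field properties described above, the following fragment shows one field entry for each of segment matching with null included, null matching, and non-equal matching. The field names and input record fields are borrowed from the samples in this topic and are placeholders only. The XML comments are explanatory and assume that the matching rules file accepts standard XML comments.

```xml
<!-- Segment matching on City. The leading comma in segment_val adds null to the list of values. -->
<MField name="City" type="Exact" segment_ind="1" segment_val=",New York,Tokyo">City</MField>

<!-- Null matching. With null_ind="1", only two null values count as a match. -->
<MField name="Person_Name" type="Exact" null_ind="1">PersonFirstName</MField>

<!-- Null matching. With null_ind="2", only a null value and a non-null value count as a match. -->
<MField name="Address_Part1" type="Exact" null_ind="2">ShippingAddress</MField>

<!-- Non-equal matching. With anti_ind="1", the field matches only when the two values differ. -->
<MField name="Telephone_Number" type="Exact" anti_ind="1">ShippingTelephoneNumber</MField>
```

Because segment matching overrides the null matching and non-equal matching configuration, the first entry does not set null_ind or anti_ind.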
- Optional. Additional attributes that you want to specify. You can specify the following attributes:
- NAMEFORMAT=L|R. Indicates whether the major word in a name or address is on the left end or the right end. For example, in Western names, the family name is on the right end of the name.
- UNICODE_ENCODING. Specifies the Unicode format of the data that you use.
You can define additional MatchConfiguration sections within the MatchRuleSet section to define multiple matching rules.
The following sample shows a matching rule definition named Rule1:
<MatchConfiguration MatchRuleID="Rule1" AutoMergeInd="yes">
    <MField name="ID" type="Exact" segment_ind="1" segment_val="M">PersonSSN</MField>
    <MField name="Person_Name" type="Exact" null_ind="1" anti_ind="0">PersonFirstName</MField>
    <MField name="Person_Name" type="Exact" null_ind="1" anti_ind="0">PersonLastName</MField>
    <MField name="Address_Part1" type="Fuzzy">ShippingAddress</MField>
    <MField name="Telephone_Number" type="Exact">ShippingTelephoneNumber</MField>
</MatchConfiguration>
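The null_ind descriptions recommend pairing a rule that uses null_ind 1 or 2 with a similar rule that sets null_ind to 0, so that other values can still match. The following is a minimal sketch of such a pair, not a definitive configuration. It assumes that MatchRuleSet is the element that encloses the MatchConfiguration sections and that rules with only exact fields need no purpose, match level, or threshold settings, which are therefore omitted. The rule IDs NameNullMatch and NameValueMatch are hypothetical and stay within the 14-character limit.

```xml
<MatchRuleSet>
    <!-- Rule A: the City field counts as a match only when both records have a null city. -->
    <MatchConfiguration MatchRuleID="NameNullMatch" AutoMergeInd="No">
        <MField name="Person_Name" type="Exact">PersonLastName</MField>
        <MField name="City" type="Exact" null_ind="1">City</MField>
    </MatchConfiguration>

    <!-- Rule B: same fields with null_ind="0", so null values never match and other values are compared. -->
    <MatchConfiguration MatchRuleID="NameValueMatch" AutoMergeInd="No">
        <MField name="Person_Name" type="Exact">PersonLastName</MField>
        <MField name="City" type="Exact" null_ind="0">City</MField>
    </MatchConfiguration>
</MatchRuleSet>
```

Both rules set AutoMergeInd to No so that matches from either rule are merged manually.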