Multi-Search Definition

Use the MULTI-SEARCH-DEFINITION section to define parameters for a cascade of searches that Identity Resolution runs.
The following excerpt is a sample multi-search definition:
    multi-search-definition
    *======================
    NAME=          multi_svoc_person
    IDT-NAME=      idt_svoc3
    SEARCH-LIST=   search_svoc_person, search_svoc_ssn
    OPTIONS=       Full-Search
Begin the definition with the MULTI-SEARCH-DEFINITION keyword, and use the following parameters:
NAME=
Required. Unique identifier for the multi-search definition. The name must not match the name of any loader definition or search definition in the system.
COMMENT=
Brief description of the multi-search.
SEARCH-LIST=
List of searches to run. Separate the search names with commas, and enclose the whole string in double quotes. All the searches must run against the same identity table. You can specify a maximum of 16 searches.
When you use the statistical output view field IDS-MS-SEARCH-NUM, the returned value identifies a search by its position in the list. For example, the value 1 refers to the first search in the list, and the value 2 refers to the second search in the list.
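The relationship between an IDS-MS-SEARCH-NUM value and the search list can be sketched in Python as follows. This is an illustration only and not part of the product: the helper names parse_search_list and search_name_for are hypothetical, and the sketch assumes that the SEARCH-LIST value is available to the client as a plain string.

```python
MAX_SEARCHES = 16  # maximum number of searches allowed in SEARCH-LIST


def parse_search_list(value: str) -> list[str]:
    """Split a SEARCH-LIST value into search names (hypothetical helper)."""
    # Drop the surrounding double quotes, then split on commas.
    names = [name.strip() for name in value.strip().strip('"').split(",")]
    names = [name for name in names if name]
    if len(names) > MAX_SEARCHES:
        raise ValueError(f"SEARCH-LIST allows at most {MAX_SEARCHES} searches")
    return names


def search_name_for(search_num: int, searches: list[str]) -> str:
    """Map a 1-based IDS-MS-SEARCH-NUM value to the matching search name."""
    if not 1 <= search_num <= len(searches):
        raise ValueError(f"search number {search_num} is outside the list")
    return searches[search_num - 1]


searches = parse_search_list('"search_svoc_person, search_svoc_ssn"')
print(search_name_for(1, searches))  # search_svoc_person
print(search_name_for(2, searches))  # search_svoc_ssn
```

Because the numbering is positional, reordering the names in SEARCH-LIST changes which search a given IDS-MS-SEARCH-NUM value refers to.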
IDT-NAME=
Required. Identifier for the identity table against which you want to perform the multi-search.
IDL-NAME=
Optional. Identifier for the link table that stores the search results.
SOURCE-DEDUP-MAX=
Optional. Number of records to cache. A multi-search caches the records that a search processes and reuses the cached results when another search in the search list retrieves the same records as candidates. This option helps only if all the searches use the same Score-Logic. The default value of 0 disables this option.
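The idea behind this cache can be sketched in Python as follows. This is a conceptual model and not the product's implementation: the ScoreCache class, the keying by record identifier, and the oldest-first eviction are all assumptions made for illustration. The sketch also shows why the searches need the same Score-Logic: a cached result is valid for a later search only if that search would score the record the same way.

```python
from collections import OrderedDict


class ScoreCache:
    """Bounded cache of scored records; max_records=0 disables caching."""

    def __init__(self, max_records: int = 0):
        self.max_records = max_records
        self._scores = OrderedDict()

    def score(self, record_id, score_fn):
        """Return the cached score for record_id, or compute and cache it."""
        if self.max_records == 0:
            return score_fn(record_id)  # caching disabled, always rescore
        if record_id in self._scores:
            return self._scores[record_id]  # reuse an earlier search's result
        result = score_fn(record_id)
        self._scores[record_id] = result
        if len(self._scores) > self.max_records:
            self._scores.popitem(last=False)  # assumed oldest-first eviction
        return result


calls = []


def expensive_score(record_id):
    """Stand-in for scoring a candidate; records every real invocation."""
    calls.append(record_id)
    return len(str(record_id))


cache = ScoreCache(max_records=100)
# Two searches whose candidate sets overlap on records 102 and 103.
for candidates in ([101, 102, 103], [102, 103, 104]):
    for record_id in candidates:
        cache.score(record_id, expensive_score)
print(len(calls))  # 4
```

In this model, the two overlapping candidates are scored once instead of twice, which is the saving that the option aims for.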
DEDUP-PROGRESS=
Optional. Maximum number of identity table records to process while looking for duplicate records before returning control to the client process. The DupFinder process treats each identity table record as a search record instead of reading search records that a client provides. By default, the search process does not return control to the client until it finds a duplicate record, which might take a long time if the identity table contains few duplicate records. When you set this parameter, the client regains control after the specified number of records and can use the interval to write progress messages. The default value of 0 disables this option.
OPTIONS=
A comma-separated list of keywords that control the multi-search process.
You can use the following options:
  • FULL-SEARCH. Processes all the searches defined in the search list.
  • FULL-SEARCH-RESTRICTED. Restricts the multi-search process from executing redundant searches in the search list to improve performance. You can use the FULL-SEARCH-RESTRICTED option only when a search limit is defined for a search.
    For example, consider the following searches defined in a multi-search definition:
    First search – Typical name search with search limit
    Second search – Narrow name search
    Third search – Date search
    Options = Full-Search-Restricted
    With the FULL-SEARCH-RESTRICTED option, the multi-search process skips redundant or unnecessary searches and moves on to the subsequent searches in the following scenarios:
    • If the first search returns records, the second search might return only a subset of the records that the first search already returned. In this scenario, the multi-search process skips the second search and executes the third search in the list.
    • If the first search does not return any records, the second search is also unlikely to return any records. In this scenario, the multi-search process executes the second search if the search limit defined for the first search prevented the execution of the first search.
    If you specify both the FULL-SEARCH and FULL-SEARCH-RESTRICTED options, the multi-search process executes all the searches in the search list until it encounters another search with a search limit.
  • LINK-PK. Specifies that the link table contains primary key columns in addition to the normal data.
  • LINK-HARD-LIMIT. Specifies that the link table does not contain links for the rejected candidates.
  • LINK-SELF. Specifies that the link table contains links for the rows that found themselves.
  • CONVENTIONAL-PATH. Specifies that the link table uses the Conventional-Path option to load.
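The FULL-SEARCH-RESTRICTED scenarios described above can be modeled for the three-search example with a short Python sketch. This is one reading of the documented behavior and not the product's logic: the Outcome values, the searches_to_run function, and the labels typical_name, narrow_name, and date are hypothetical, and the sketch assumes that the narrow name search is skipped whenever the search limit did not stop the first search.

```python
from enum import Enum


class Outcome(Enum):
    """Possible results of the first (typical name) search."""
    RECORDS = "returned records"
    NO_RECORDS = "returned no records"
    LIMIT_HIT = "stopped by its search limit"


def searches_to_run(first_outcome: Outcome) -> list[str]:
    """List the searches attempted in the three-search example."""
    plan = ["typical_name"]  # the first search is always attempted
    if first_outcome is Outcome.LIMIT_HIT:
        # Assumed fallback: the narrow search runs only when the search
        # limit prevented the typical search from executing.
        plan.append("narrow_name")
    plan.append("date")  # the unrelated date search is not redundant
    return plan


for outcome in Outcome:
    print(outcome.name, "->", searches_to_run(outcome))
```

Under these assumptions, the narrow name search acts purely as a fallback for the limited search, while the date search runs in every scenario.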
