Multi-Search Definition

Use the MULTI-SEARCH-DEFINITION section to define parameters for a cascade of searches that MDM Registry Edition runs.
The following excerpt is a sample multi-search definition:
    multi-search-definition
    *======================
    NAME=                  multi_svoc_person
    IDT-NAME=              idt_svoc3
    SEARCH-LIST=           "search_svoc_person,search_svoc_ssn"
    OPTIONS=               Full-Search
    PERSISTENT-ID-PREFIX=  pe
    PERSISTENT-ID-METHOD=  Merge
    PERSISTENT-ID-OPTIONS= Pre-Merge-Review, Audit-Cluster, Audit-Record, Apply-FLUL
    PERSISTENT-ID-MRULES=  merge_svoc
Begin the definition with the MULTI-SEARCH-DEFINITION keyword, and use the following parameters:
NAME=
Required. Unique identifier for the multi-search definition. The name must not match any loader definition or search definition names in a system.
COMMENT=
Brief description of the multi-search.
SEARCH-LIST=
List of searches to run. Separate the search names with commas, and enclose the whole string in double quotes. All the searches must run against the same identity table. You can specify a maximum of 16 searches.
When you use the statistical output view field IDS-MS-SEARCH-NUM, the returned value identifies a search by its position in the list. For example, the value 1 refers to the first search in the list, and the value 2 refers to the second search in the list.
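The SEARCH-LIST rules and the IDS-MS-SEARCH-NUM numbering can be sketched in a few lines of Python. This is only an illustration of the rules stated above, not part of the product: the function names parse_search_list and search_name_for are hypothetical, and the search names come from the sample definition.

```python
from typing import List

MAX_SEARCHES = 16  # limit stated for SEARCH-LIST


def parse_search_list(raw: str) -> List[str]:
    """Split a quoted, comma-separated SEARCH-LIST value into search names."""
    value = raw.strip()
    if len(value) < 2 or not (value.startswith('"') and value.endswith('"')):
        raise ValueError("SEARCH-LIST must be enclosed in double quotes")
    names = [name.strip() for name in value[1:-1].split(",")]
    if any(not name for name in names):
        raise ValueError("SEARCH-LIST contains an empty search name")
    if len(names) > MAX_SEARCHES:
        raise ValueError(f"SEARCH-LIST allows at most {MAX_SEARCHES} searches")
    return names


def search_name_for(search_num: int, names: List[str]) -> str:
    """Map a 1-based IDS-MS-SEARCH-NUM value to the search name it refers to."""
    if not 1 <= search_num <= len(names):
        raise ValueError(f"search number {search_num} is out of range")
    return names[search_num - 1]


searches = parse_search_list('"search_svoc_person,search_svoc_ssn"')
print(search_name_for(1, searches))  # search_svoc_person
print(search_name_for(2, searches))  # search_svoc_ssn
```

A helper like this is useful when you post-process output views, because IDS-MS-SEARCH-NUM is 1-based while most list types are 0-based.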
IDT-NAME=
Required. Identifier for the identity table against which you want to perform the multi-search.
IDL-NAME=
Optional. Identifier for the link table that stores the search results.
SOURCE-DEDUP-MAX=
Optional. Number of records to cache. A multi-search caches the records that a search processes and uses the results when another search in the search list retrieves them as candidates. This option helps only if all the searches use the same Score-Logic. The default value of 0 disables this option.
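The idea behind SOURCE-DEDUP-MAX can be pictured as a bounded cache that later searches consult before scoring a candidate again. The following Python sketch is a conceptual illustration only and does not describe how MDM Registry Edition implements the cache: the class SourceDedupCache, the function score_candidates, the eviction order, and the placeholder scoring function are all assumptions. Reusing a cached score is only meaningful when every search scores records the same way, which mirrors the Score-Logic condition above.

```python
from collections import OrderedDict
from typing import Callable, Dict, List, Optional


class SourceDedupCache:
    """Hypothetical bounded cache of per-record scores shared across searches."""

    def __init__(self, max_records: int) -> None:
        self.max_records = max_records  # 0 disables caching, like the default
        self._scores = OrderedDict()

    def get(self, record_id: str) -> Optional[int]:
        return self._scores.get(record_id)

    def put(self, record_id: str, score: int) -> None:
        if self.max_records <= 0:
            return
        self._scores[record_id] = score
        self._scores.move_to_end(record_id)
        while len(self._scores) > self.max_records:
            self._scores.popitem(last=False)  # evict the oldest entry (assumed policy)


def score_candidates(
    candidates: List[str],
    score_fn: Callable[[str], int],
    cache: SourceDedupCache,
) -> Dict[str, int]:
    """Score candidates, reusing results that an earlier search already produced."""
    scores = {}
    for record_id in candidates:
        cached = cache.get(record_id)
        if cached is None:
            cached = score_fn(record_id)
            cache.put(record_id, cached)
        scores[record_id] = cached
    return scores


calls = []


def expensive_score(record_id: str) -> int:
    calls.append(record_id)
    return len(record_id)  # placeholder for the shared scoring logic


cache = SourceDedupCache(max_records=100)
score_candidates(["r1", "r2"], expensive_score, cache)  # first search in the list
score_candidates(["r2", "r3"], expensive_score, cache)  # second search reuses r2
print(calls)  # ['r1', 'r2', 'r3']
```

The second search scores only r3, because r2 was already processed by the first search.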
DEDUP-PROGRESS=
Optional. Maximum number of identity table records to process to find duplicate records before returning control to a client process. The DupFinder process treats each identity record as a search record instead of reading search records provided by a client. Without this parameter, the search process does not return control to the client until it finds a duplicate record, which might take a long time if the identity table contains few duplicate records. When control returns after the specified number of records, the client can use the interval to write progress messages. The default value of 0 disables this option.
OPTIONS=
A comma-separated list of keywords that control the multi-search. Use the following options:
  • FULL-SEARCH. Processes all the searches in the list.
  • LINK-PK. Specifies that the link table contains primary key columns in addition to the normal data.
  • LINK-HARD-LIMIT. Specifies that the link table does not contain links for the rejected candidates.
  • LINK-SELF. Specifies that the link table contains links for the rows that found themselves.
  • CONVENTIONAL-PATH. Specifies that the link table uses the Conventional-Path option to load.
PERSISTENT-ID-PREFIX=
Two-character identifier for the clustering strategy. Use a different identifier for each clustering strategy that searches the same identity table. However, if you want to maintain the initial clusters that you create, use the same identifier for the best or merge method.
For example, use the pre-clustered method to create initial clusters with AA as the identifier. To maintain the clusters, use the best or merge method with AA as the identifier because the best or merge method must operate on the same clusters that the pre-clustered method creates.
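The prefix guidance can be checked mechanically before you load a system. The Python sketch below is a hypothetical helper, not a product feature. It assumes that each clustering strategy is described by a tuple of definition name, identity table, prefix, and method, and it interprets the rule narrowly: a shared prefix on the same identity table is accepted only when one strategy uses the pre-clustered method and the others use best or merge. Prefixes are compared exactly, because the text does not say whether case is significant. The definition names and the column name cluster_col are made up for illustration.

```python
from collections import defaultdict
from typing import List, Tuple

# (definition name, IDT name, PERSISTENT-ID-PREFIX, PERSISTENT-ID-METHOD)
Strategy = Tuple[str, str, str, str]

MAINTAINING_METHODS = {"best", "merge"}


def is_pre_clustered(method: str) -> bool:
    return method.strip().lower().startswith("pre-clustered")


def check_prefixes(strategies: List[Strategy]) -> List[str]:
    """Return a list of problems found in the prefix assignments."""
    problems = []
    groups = defaultdict(list)
    for name, idt_name, prefix, method in strategies:
        if len(prefix) != 2:
            problems.append(f"{name}: prefix {prefix!r} is not two characters")
        groups[(idt_name, prefix)].append((name, method))
    for (idt_name, prefix), members in groups.items():
        if len(members) < 2:
            continue
        creators = [m for _, m in members if is_pre_clustered(m)]
        maintainers = [m.strip().lower() for _, m in members if not is_pre_clustered(m)]
        # Assumed reading of the rule: one pre-clustered strategy creates the
        # clusters, and best or merge strategies maintain them.
        shared_on_purpose = len(creators) == 1 and all(
            m in MAINTAINING_METHODS for m in maintainers
        )
        if not shared_on_purpose:
            names = ", ".join(n for n, _ in members)
            problems.append(f"prefix {prefix!r} on {idt_name} is shared by: {names}")
    return problems


maintained = [
    ("multi_initial", "idt_svoc3", "AA", "pre-clustered (cluster_col)"),
    ("multi_maintain", "idt_svoc3", "AA", "merge"),
]
clash = [
    ("multi_seed", "idt_svoc3", "pe", "seed"),
    ("multi_best", "idt_svoc3", "pe", "best"),
]
print(check_prefixes(maintained))  # []
print(len(check_prefixes(clash)))  # 1
```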
PERSISTENT-ID-METHOD=
Name of the clustering method. Use the following methods:
  • best
  • merge
  • seed
  • pre-clustered (<Column Name>)
PERSISTENT-ID-OPTIONS=
Additional clustering options that you want to specify. You can use multiple options separated by commas. Use the following options:
  • Initial
  • NoNew
  • Pre-Merge-Review
  • Post-Merge-Review
  • Best-Undecided-Review
  • Preferred-Record-Review
  • Apply-FLUL
  • Audit-Cluster
  • Audit-Record
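A small validator can catch misspelled method or option names before a definition is loaded. The following Python sketch checks values against the names listed above. It is a hypothetical helper rather than a product API, and it assumes that keywords are case-insensitive, which the sample definition suggests (it writes Merge where the list shows merge) but which this section does not state. The column name cluster_col is made up for illustration.

```python
import re
from typing import List, Optional, Tuple

METHODS = {"best", "merge", "seed"}
OPTIONS = {
    "initial", "nonew", "pre-merge-review", "post-merge-review",
    "best-undecided-review", "preferred-record-review",
    "apply-flul", "audit-cluster", "audit-record",
}

# Matches "pre-clustered (<Column Name>)" and captures the column name.
PRE_CLUSTERED = re.compile(r"^pre-clustered\s*\(\s*([^()]+?)\s*\)$", re.IGNORECASE)


def parse_method(value: str) -> Tuple[str, Optional[str]]:
    """Return (method, column); column is set only for the pre-clustered method."""
    text = value.strip()
    match = PRE_CLUSTERED.match(text)
    if match:
        return "pre-clustered", match.group(1)
    if text.lower() in METHODS:
        return text.lower(), None
    raise ValueError(f"unknown PERSISTENT-ID-METHOD: {value!r}")


def parse_options(value: str) -> List[str]:
    """Split a comma-separated option list and reject names that are not listed."""
    options = [item.strip() for item in value.split(",") if item.strip()]
    unknown = [item for item in options if item.lower() not in OPTIONS]
    if unknown:
        raise ValueError(f"unknown PERSISTENT-ID-OPTIONS: {unknown}")
    return options


print(parse_method("Merge"))                        # ('merge', None)
print(parse_method("pre-clustered (cluster_col)"))  # ('pre-clustered', 'cluster_col')
print(parse_options("Pre-Merge-Review, Audit-Cluster, Audit-Record, Apply-FLUL"))
```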
PERSISTENT-ID-MRULES=
Name of the merge definition that maintains the preferred records of the clusters. Specify this option to use Informatica Data Director to view the preferred records.
