User Guide

10.5 HotFix 2

Batch Search Client - DupFinder

The DupFinder function is a batch search application designed to discover duplicate records within data previously loaded into the Clustering Database. It does so by using each record in the Clustering Database as a search transaction against the same database. It uses the nominated Search Definition to find duplicate records from the Clustering Database and writes the search results to a flat file.

Because each search transaction is itself a record in the Clustering Database, every search matches its own source record. The report includes these self-matches unless the run-time option that removes the source record is used (see Remove Search Record below).
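The self-search behaviour described above can be pictured with a small sketch. This is not DupFinder's implementation or interface: the function name `find_duplicates`, the `threshold` value, and the use of Python's `difflib.SequenceMatcher` as a stand-in for the scoring done by a Search Definition are all assumptions made for illustration only.

```python
from difflib import SequenceMatcher


def find_duplicates(records, threshold=0.8, remove_search_record=True):
    """Use every record as a search against the same set of records.

    records: dict mapping a record ID to a text value (hypothetical layout).
    threshold: assumed acceptance score, standing in for the Search Definition.
    remove_search_record: when True, hide the match of a record with itself.
    """
    results = {}
    for search_id, search_value in records.items():
        matches = []
        for candidate_id, candidate_value in records.items():
            # Each record always matches itself, so skip it when requested.
            if remove_search_record and candidate_id == search_id:
                continue
            score = SequenceMatcher(
                None, search_value.lower(), candidate_value.lower()
            ).ratio()
            if score >= threshold:
                matches.append((candidate_id, round(score, 2)))
        results[search_id] = matches
    return results


if __name__ == "__main__":
    sample = {1: "John Smith", 2: "Jon Smith", 3: "Mary Jones"}
    # Without removal, every record reports itself as a match.
    print(find_duplicates(sample, remove_search_record=False))
    # With removal, only genuine candidate duplicates remain.
    print(find_duplicates(sample, remove_search_record=True))
```

With removal disabled, a record that has no real duplicate still appears in the results because it matches itself, which is the effect the source-record option is meant to suppress.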

Starting from the Console

DupFinder can be started from the Console Client by selecting Tools > DupFinder. This brings up the DupFinder options screen:

Field	Description
Output File	All duplicate records that have an acceptable score as determined by the Search Definition are written to the output file.
Search Definition	You must choose the Search Definition to be used from this drop-down list.
Search Width	If you have predefined search widths (Narrow, Typical or Exhaustive), you can choose one here. If left blank, the control defined in the relevant search is used.
Match Tolerance	If you have predefined match tolerances (Conservative, Typical or Loose), you can choose one here. If left blank, the control defined in the relevant search is used.
Output Format	Choose the output report format here. Values 0 to 7 are valid and are described in the Relate - Report Formats section.
Starting Record ID	Enables commencement of the deduplication process at a nominated Record ID value.
Extra Options	This field can be used to enter extra command line switches supported by future versions of the DupFinder program. See the Extra options for Relate and/or DupFinder section below for more information.
Return Search Records Only	Only return the Search Record for which a match was found.
Remove Search Record	Because the file is matched against itself, every search record matches its own identical copy, and the report shows these matches. This is usually not wanted, so this option hides the search record.
Append New Line	Append a newline to the output report after each record. This option affects only report formats 0, 1, 3, 4 and 6. If this option is not specified, all output records are written on a single line and the output should be treated as fixed-length records.
Trim Trailing Blanks	Remove trailing blanks from each output record. This option affects only report formats 0, 3, 4 and 6. It also implies Append New Line, so that the boundaries between output records are not lost.
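The relationship between fixed-length output, Append New Line and Trim Trailing Blanks can be illustrated with a short sketch. It assumes a hypothetical record length and sample data; the real record layout depends on the chosen report format and is not described here. The helper names `split_fixed_length` and `trim_and_terminate` are invented for this example.

```python
def split_fixed_length(data, record_length):
    """Split a single-line report into records of a known fixed length.

    record_length is an assumption supplied by the caller; without newlines
    it is the only way to find the boundaries between records.
    """
    if record_length <= 0:
        raise ValueError("record_length must be positive")
    if len(data) % record_length != 0:
        raise ValueError("data length is not a multiple of record_length")
    return [
        data[start:start + record_length]
        for start in range(0, len(data), record_length)
    ]


def trim_and_terminate(records):
    """Strip trailing blanks and end each record with a newline.

    Once blanks are trimmed the records no longer share a length, so the
    newline is what preserves the record boundaries.
    """
    return "".join(record.rstrip(" ") + "\n" for record in records)


if __name__ == "__main__":
    # Three hypothetical records, each padded with blanks to 5 characters.
    single_line = "AB   CDE  F    "
    records = split_fixed_length(single_line, 5)
    print(records)
    print(trim_and_terminate(records), end="")
```

The second helper shows why trimming implies a newline: after the padding is removed, position alone can no longer separate one record from the next.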
