The Merge Control Table (MCT) is a database table containing information about the merge process. Each record in the MCT relates to a single cluster. The table can be read to retrieve results of the merging process, such as the record-id of the merged result row (stored in the IDT). The name of the MCT is the name of the corresponding IDT suffixed by '_MC'. The columns of the MCT are described below.
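As an illustration of the naming rule and of reading results from the MCT, the following is a minimal sketch rather than a supported procedure. It uses Python's built-in sqlite3 module as a stand-in for the real database. The IDT name IDT_CUSTOMER, the column types, and the sample rows are assumptions made for the example; only the '_MC' suffix and the column names IDS_CL_ID, IDS_CL_NUM and IDS_RESULT_ID come from this section.

```python
import sqlite3


def mct_name(idt_name: str) -> str:
    """Return the MCT name for an IDT: the IDT name suffixed by '_MC'."""
    return idt_name + "_MC"


def fetch_result_ids(conn, idt_name):
    """Return (IDS_CL_ID, IDS_CL_NUM, IDS_RESULT_ID) for every cluster in the MCT."""
    # Table names cannot be bound as SQL parameters, so idt_name must come
    # from a trusted source (for example, your own system definition).
    sql = (
        "SELECT IDS_CL_ID, IDS_CL_NUM, IDS_RESULT_ID "
        f'FROM "{mct_name(idt_name)}" ORDER BY IDS_CL_NUM'
    )
    return conn.execute(sql).fetchall()


# Stand-in database with a hypothetical MCT; column types are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE "IDT_CUSTOMER_MC" '
    "(IDS_CL_ID TEXT, IDS_CL_NUM INTEGER, IDS_RESULT_ID INTEGER)"
)
conn.executemany(
    'INSERT INTO "IDT_CUSTOMER_MC" VALUES (?, ?, ?)',
    [("AA", 2, 17), ("AA", 1, 5)],
)
rows = fetch_result_ids(conn, "IDT_CUSTOMER")
```

Against a production database, the same SELECT statement would be issued through whichever driver your site uses; only the connection setup would differ.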
RECID
is reserved for internal use.
IDS_CL_ID
contains the ClusterId (defined by Persistent-ID-Prefix in the Persistent-ID definition). See the Persistent-ID-Prefix section.
IDS_CL_NUM
contains the ClusterNumber.
IDS_RESULT_ID
The record in the IDT which is the preferred/result record for the cluster.
IDS_STATUS
The current status of the cluster (see Cluster Status Types below).
IDS_SELECTED
A binary field 512 bytes long in which each 2-byte value relates to a column in the IDT. It specifies which record of the cluster was used in the preferred/result record for this column, with the first record being 0, the second 1, and so forth. A value of 0xFFFF indicates that no record was selected for the result in the column.
IDS_FLAGS
A binary field 256 bytes long in which each byte represents a column in the IDT. Each value signals information about the result for this column. The following values are currently defined:
0x0000 = Unique: A single unique result was found.
0x0001 = Multiple: The result for this column matches multiple records, each containing the same value.
0x0002 = Generated: The result for this column is a generated value, such as an average or summation.
0x0004 = No Result: The rules failed to find a result for this column.
0x0008 = NULL Result: The result for this column is a NULL value.
IDS_COMMENT
This contains a comment about the cluster. Its length is 255 characters.
IDS_REVISION
Contains the revision count for the cluster.
IDS_CL_FLAGS
Contains a flag field for the cluster.
0x0001 = Inserted: Indicates that the preferred record is a generated record that has been inserted into the IDT.
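The binary columns described above can be unpacked in application code once they have been read from the MCT. The following Python sketch shows one possible way to do this. It rests on several assumptions that this section does not confirm: that the database driver returns the fields as bytes, that the 2-byte values in IDS_SELECTED are little-endian (pass ">" if they turn out to be big-endian), that entries appear in IDT column order starting at offset 0, and that each IDS_FLAGS byte holds exactly one of the listed values rather than a combination. The function names are hypothetical.

```python
import struct

NO_RECORD = 0xFFFF  # IDS_SELECTED marker: no record was selected for the column

# IDS_FLAGS values as listed in this section, one byte per IDT column.
FLAG_NAMES = {
    0x00: "Unique",
    0x01: "Multiple",
    0x02: "Generated",
    0x04: "No Result",
    0x08: "NULL Result",
}

CL_FLAG_INSERTED = 0x0001  # IDS_CL_FLAGS: preferred record was inserted into the IDT


def decode_selected(blob, n_columns, byte_order="<"):
    """Return, per IDT column, the index of the cluster record used (or None)."""
    if len(blob) != 512:
        raise ValueError("IDS_SELECTED is expected to be 512 bytes long")
    # 512 bytes = 256 unsigned 16-bit values; the byte order is an assumption.
    values = struct.unpack(f"{byte_order}256H", blob)
    return [None if v == NO_RECORD else v for v in values[:n_columns]]


def decode_flags(blob, n_columns):
    """Return, per IDT column, the name of the result flag."""
    if len(blob) != 256:
        raise ValueError("IDS_FLAGS is expected to be 256 bytes long")
    return [FLAG_NAMES.get(b, f"Unknown (0x{b:02X})") for b in blob[:n_columns]]


def is_inserted(cl_flags):
    """Return True if the Inserted bit is set in IDS_CL_FLAGS."""
    return bool(cl_flags & CL_FLAG_INSERTED)


# Made-up sample data for a three-column IDT.
selected = struct.pack("<3H", 0, 2, NO_RECORD) + b"\xff" * (512 - 6)
flags = bytes([0x00, 0x02, 0x04]) + bytes(253)
selected_records = decode_selected(selected, 3)
flag_names = decode_flags(flags, 3)
```

With this sample data, the first column's result came from record 0 of the cluster, the second column's from record 2, and no record was selected for the third column.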
Outline for using an MCT
You need to do the following:
Define your system with a search strategy appropriate to your clustering needs. Ensure that the multi-search has all the fields required for Persistent-ID work.
Create a corresponding Merge-Definition with Master and Column rules as desired. Ensure that you associate the multi-search with this definition by setting the option in the multi-search.
Create the system and load the IDT.
Run Synchronizer with the --persist option to perform initial clustering and to create and populate the MCT.
Define an IDT in your System with a layout matching your clustered data. Specify the IDT-ONLY option since an IDX is not required.
Create a corresponding Merge-Definition with Master and Column rules as desired.
Load the system and the IDT. The Table-Loader will load rows into the IDT and create a corresponding MCT.
Use the Informatica Data Director GUI to review the clusters. In some cases, merge rules return ambiguous results because the rules are insufficient or conflicting. In these cases, the Informatica Data Director can be used to construct the merge result manually. While reviewing clusters, the user can also change cluster membership using the merge or split functionality.
Once satisfied with the merge results, you may use SQL to select the merge results from the IDT (using the CL_ID column in the IDT or the Result Id column in the MCT). Generally this will occur after a manual review using the Informatica Data Director when all clusters have been marked as