Merge Control Table

The Merge Control Table (MCT) is a database table that contains information about the merge process. Each record in the MCT relates to a single cluster. You can read the table to retrieve results of the merge process, such as the record ID of the merged result row (stored in the IDT). The MCT name is the corresponding IDT name with the suffix '_MC'. The columns of the MCT are described below.
  • RECID
    is reserved for internal use.
  • IDS_CL_ID
    contains the ClusterId (defined by Persistent-ID-Prefix in the Persistent-ID definition). See the Persistent-ID-Prefix section.
  • IDS_CL_NUM
    contains the ClusterNumber.
  • IDS_RESULT_ID
    The record in the IDT which is the preferred/result record for the cluster.
  • IDS_STATUS
    The current status of the cluster (see Cluster Status Types below).
  • IDS_SELECTED
    A binary field 512 bytes long, where each 2 bytes relate to a column in the IDT. Each 2-byte value specifies which record of the cluster was used in the preferred/result record for that column, with the first record being 0, the second 1, and so forth. A value of 0xFFFF indicates that no record was selected for the result in that column.
  • IDS_FLAGS
    A binary field 256 bytes long, where each byte represents a column in the IDT. Each value signals information about the result for that column. There are at present 5 possible values:
    1. 0x0000 = Unique: A single unique result was found.
    2. 0x0001 = Multiple: The result for this column matches multiple records, each containing the same value.
    3. 0x0002 = Generated: The result for this column is a generated value, such as an average or summation.
    4. 0x0004 = No Result: The rules failed to find a result for this column.
    5. 0x0008 = NULL Result: The result for this column is a NULL value.
  • IDS_COMMENT
    This contains a comment about the cluster. Its length is 255 characters.
  • IDS_REVISION
    Contains the revision count for the cluster.
  • IDS_CL_FLAGS
    Contains a flag field for the cluster.
    0x0001 = Inserted: Indicates that the preferred record is a generated record that has been inserted into the IDT.
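
As a rough illustration of the IDS_SELECTED and IDS_FLAGS layouts described above, the following Python sketch decodes both fields for the first few IDT columns. The byte order of the 2-byte IDS_SELECTED entries is not stated here, so little-endian is an assumption, as is treating IDS_CL_FLAGS as an integer bit field. The function names are hypothetical and are not part of any product API.

```python
import struct

NO_SELECTION = 0xFFFF  # documented marker: no record selected for this column

# Documented IDS_FLAGS values, one byte per IDT column.
FLAG_NAMES = {
    0x00: "Unique",
    0x01: "Multiple",
    0x02: "Generated",
    0x04: "No Result",
    0x08: "NULL Result",
}


def decode_selected(blob, num_columns, byte_order="<"):
    """Return, per IDT column, the 0-based index of the cluster record
    used for the result, or None where the entry is 0xFFFF.

    byte_order is an ASSUMPTION ("<" little-endian, ">" big-endian);
    verify it against real data before relying on the result.
    """
    if len(blob) < 2 * num_columns:
        raise ValueError("IDS_SELECTED is shorter than 2 bytes per column")
    values = struct.unpack_from(f"{byte_order}{num_columns}H", blob, 0)
    return [None if v == NO_SELECTION else v for v in values]


def decode_flags(blob, num_columns):
    """Return the flag name for each of the first num_columns bytes."""
    return [FLAG_NAMES.get(b, f"Unknown (0x{b:02X})") for b in blob[:num_columns]]


def is_inserted(cl_flags):
    """True if bit 0x0001 (Inserted) is set; assumes an integer flag field."""
    return bool(cl_flags & 0x0001)


# Synthetic example data for a 3-column IDT, padded to the documented lengths.
selected = struct.pack("<3H", 0, 2, 0xFFFF) + bytes(512 - 6)
flags = bytes([0x00, 0x02, 0x04]) + bytes(256 - 3)

print(decode_selected(selected, 3))  # [0, 2, None]
print(decode_flags(flags, 3))        # ['Unique', 'Generated', 'No Result']
```

In this synthetic example, the first column takes its value from the first record of the cluster, the second column holds a generated value attributed to the third record, and the third column has no result.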

Outline for using an MCT

You need to do the following:
  • Define your system with an appropriate search strategy for your clustering needs. Ensure that the multi-search has all the fields required for Persistent-ID work.
  • Create a corresponding Merge-Definition with Master and Column rules as desired. Ensure that you associate the multi-search with this definition by setting the option in the multi-search.
  • Create the system and load the IDT.
  • Run Synchronizer with the --persist option to perform initial clustering and to create and populate the MCT.
  • Define an IDT in your System with a layout matching your clustered data. Specify the IDT-ONLY option, since an IDX is not required.
  • Create a corresponding Merge-Definition with Master and Column rules as desired.
  • Load the system and the IDT. The Table-Loader will load rows into the IDT and create a corresponding MCT.
  • Use the Informatica Data Director GUI to review the clusters. In some cases, merge rules can return ambiguous results because the rules are insufficient or conflicting. In these cases, you can use the Informatica Data Director to construct the merge result manually. While reviewing clusters, you can also change cluster membership with the merge or split functionality.
  • Once satisfied with the merge results, you can use SQL to select the merge results from the IDT (using the CL_ID column in the IDT or the Result Id column in the MCT). Generally this occurs after a manual review using the Informatica Data Director, when all clusters have been marked as Accepted.
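
The last step can be sketched as a join between the MCT and its IDT. The example below is only a minimal sketch: it uses an in-memory SQLite database as a stand-in for the real database, and the IDT name (IDT_CUSTOMER), the IDT record-id column (REC_ID), the NAME column, and the status code 'A' for Accepted are all hypothetical. Only the '_MC' suffix and the IDS_* column names come from the description above; check your own schema and the Cluster Status Types for the actual values.

```python
import sqlite3


def fetch_merge_results(conn, idt_name, accepted_status):
    """Return (cluster id, cluster number, name) for each accepted cluster.

    The MCT name is the IDT name suffixed with '_MC'. Table names cannot be
    bound as SQL parameters, so idt_name must come from trusted configuration.
    REC_ID and NAME are HYPOTHETICAL IDT columns used for illustration.
    """
    mct_name = idt_name + "_MC"
    sql = (
        f"SELECT m.IDS_CL_ID, m.IDS_CL_NUM, i.NAME "
        f"FROM {mct_name} m "
        f"JOIN {idt_name} i ON i.REC_ID = m.IDS_RESULT_ID "
        f"WHERE m.IDS_STATUS = ? "
        f"ORDER BY m.IDS_CL_NUM"
    )
    return conn.execute(sql, (accepted_status,)).fetchall()


# Stand-in schema and data; a real IDT and MCT have more columns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE IDT_CUSTOMER (REC_ID INTEGER PRIMARY KEY, CL_ID TEXT, NAME TEXT);
CREATE TABLE IDT_CUSTOMER_MC (
    IDS_CL_ID TEXT, IDS_CL_NUM INTEGER, IDS_RESULT_ID INTEGER, IDS_STATUS TEXT
);
INSERT INTO IDT_CUSTOMER VALUES (1, 'AA', 'J Smith');
INSERT INTO IDT_CUSTOMER VALUES (2, 'AA', 'John Smith');
INSERT INTO IDT_CUSTOMER VALUES (3, 'AA', 'Jane Doe');
INSERT INTO IDT_CUSTOMER_MC VALUES ('AA', 1, 2, 'A');  -- accepted (hypothetical code)
INSERT INTO IDT_CUSTOMER_MC VALUES ('AA', 2, 3, 'R');  -- not yet accepted
""")

print(fetch_merge_results(conn, "IDT_CUSTOMER", "A"))  # [('AA', 1, 'John Smith')]
```

Joining on IDS_RESULT_ID returns one preferred row per cluster, and the status filter restricts the output to clusters that have passed review.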
