The Persistent-ID (PID) feature of MDM-RE facilitates the dynamic creation and maintenance of clusters (groups of related records) within an IDT. The clusters persist, since they are stored in an SQL-accessible table on the target database, and are dynamic, in the sense that they are maintained by the Synchronizer as updates to the IDT are processed.
The clustering relationships, known as memberships, are stored in a table named IDT_Name_MB, which is created when the first membership records need to be stored. Multiple clusterings may be stored in the same table, since each membership record is qualified by a two-character prefix known as the ClusterId, which uniquely identifies the clustering it belongs to.
Records within one cluster have the same Cluster_Number assigned to them. Note that cluster numbers may change when, for example, two clusters are merged or split. Thus, cluster numbers themselves do not persist. Instead, the relationships between rows will persist, and will be maintained as updates to the IDT occur.
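The following minimal sketch illustrates why a cluster number is not a stable identifier while the memberships themselves are. It is not the Synchronizer's actual logic. The table name demo_mb, its columns (cluster_id, cluster_number, source_pk), and the rule that a merge keeps one of the two numbers are all assumptions made for illustration; the real layout of IDT_Name_MB may differ.

```python
import sqlite3


def build_demo():
    """Create a hypothetical membership table in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE demo_mb ("
        " cluster_id TEXT, cluster_number INTEGER, source_pk TEXT)"
    )
    # One clustering ("aa") with three clusters: {A, B}, {C}, {D}.
    conn.executemany(
        "INSERT INTO demo_mb VALUES (?, ?, ?)",
        [("aa", 1, "A"), ("aa", 1, "B"), ("aa", 2, "C"), ("aa", 3, "D")],
    )
    return conn


def merge_clusters(conn, cluster_id, keep, absorb):
    """Assumed merge rule: members of `absorb` take the number `keep`."""
    conn.execute(
        "UPDATE demo_mb SET cluster_number = ? "
        "WHERE cluster_id = ? AND cluster_number = ?",
        (keep, cluster_id, absorb),
    )


def cluster_number_of(conn, cluster_id, source_pk):
    """Return the current cluster number of a record, or None if absent."""
    row = conn.execute(
        "SELECT cluster_number FROM demo_mb "
        "WHERE cluster_id = ? AND source_pk = ?",
        (cluster_id, source_pk),
    ).fetchone()
    return row[0] if row else None
```

In this sketch, calling merge_clusters(conn, "aa", keep=1, absorb=2) changes the number held by record C, so any copy of the old number kept elsewhere would be stale. Records A and B still share a cluster afterwards, which is the relationship that persists.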
Since cluster numbers may change, it is not advisable to propagate them to your source systems. If this is unavoidable, the propagated copies must be kept up to date. As an aid, we provide an audit trail that records all changes made to cluster numbers. It is the user’s responsibility to use this information to maintain their copies. Alternatively, you can avoid the problem by accessing clusters using source Primary Keys (PKs), which are presumably stable. An API is provided to return all source PKs in the same cluster, given one of the PK values.
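The lookup by source Primary Key can be pictured as a self-join on the membership table. The sketch below is not the product API, whose name and signature are not given here. It only shows the idea against a hypothetical table demo_mb with assumed columns cluster_id, cluster_number, and source_pk.

```python
import sqlite3


def make_membership_demo():
    """Create a hypothetical membership table holding two clusterings."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE demo_mb ("
        " cluster_id TEXT, cluster_number INTEGER, source_pk TEXT)"
    )
    conn.executemany(
        "INSERT INTO demo_mb VALUES (?, ?, ?)",
        [
            ("aa", 1, "A"), ("aa", 1, "B"), ("aa", 2, "C"),  # clustering "aa"
            ("bb", 7, "A"), ("bb", 7, "C"),                  # clustering "bb"
        ],
    )
    return conn


def pks_in_same_cluster(conn, cluster_id, source_pk):
    """Return every source PK sharing a cluster with `source_pk`.

    The cluster number is used only inside the join and is never
    returned, so the caller does not depend on its value.
    """
    rows = conn.execute(
        "SELECT m.source_pk FROM demo_mb AS m "
        "JOIN demo_mb AS k "
        "  ON k.cluster_id = m.cluster_id "
        " AND k.cluster_number = m.cluster_number "
        "WHERE k.cluster_id = ? AND k.source_pk = ? "
        "ORDER BY m.source_pk",
        (cluster_id, source_pk),
    ).fetchall()
    return [r[0] for r in rows]
```

Because the result is keyed by source PK and qualified by the clustering prefix, the same record can belong to different groups in different clusterings, and a renumbering of clusters does not affect the caller.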
In addition to building and maintaining relationships between rows of an IDT, each cluster may optionally have a master record, which represents the most accurate version of the data within a cluster. Refer to the Merge Rules section of this manual for details.
Once merge rules have been defined, the master records are created automatically by the Synchronizer. In some cases, it may be desirable to manually check the clustering result, and/or the master records that have been created. This process is performed with the Informatica Data Director, which has a graphical user interface. Master records are stored in the IDT and are identified by their non-NULL ClusterId. In fact, the value in the ClusterId column of the IDT is identical to the ClusterId stored in the Membership table, to link both of them to a particular clustering strategy.