Membership Attributes


The clustering process creates Cluster membership records. Each record contains status information about the "membership" of a data record in a particular cluster.
The layout of the fields used to store membership information is documented in the Customized Format Definition Syntax section.
A member in a cluster is assigned certain attributes that determine how it behaves during the clustering process:
  • Voting / Non-Voting
  • Undecided
Attributes are set when a membership record is created and are used while determining new members for a cluster.
Some of the Attributes are also present in the indexes, which allows efficient processing of the database while selecting records for a nominated set of membership attributes. The subset of Attributes held in the indexes is referred to as the Selection Attributes.

Voting Attribute

A cluster member can be either Voting or Non-Voting. A Voting member is used in the scoring process when new records are being matched to the cluster it participates in. Non-Voting members do not participate in the scoring process and therefore cannot attract new members to the cluster. The following Job Options control the voting rules.
Set Attributes
  • SET-ALL-VOTE - all members that are added can vote
  • SET-NONE-VOTE - members that are added cannot vote
  • SET-HEADERS-VOTE - only the founding member of the cluster can vote (one per cluster)
Use Attributes
  • USE-ATTRIBUTES - honour the voting attribute
Defaults
The default is SET-ALL-VOTE, --USE-ATTRIBUTES, meaning that the attributes are not honoured and all new members are set as Voting. To enable the use of attributes, specify USE-ATTRIBUTES.
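The voting rules above can be sketched in a few lines of Python. This is an illustrative model only, not the product's implementation: the `Member` class, its fields, and both function names are hypothetical, and the sketch assumes that when attributes are not honoured (--USE-ATTRIBUTES) every member takes part in scoring.

```python
from dataclasses import dataclass


@dataclass
class Member:
    """Hypothetical membership record; field names are illustrative only."""
    record_id: str
    is_header: bool  # True for the founding member of the cluster
    voting: bool


def new_member(record_id, is_header, set_option="SET-ALL-VOTE"):
    """Create a member whose Voting attribute follows the SET-* job option."""
    if set_option == "SET-ALL-VOTE":
        voting = True
    elif set_option == "SET-NONE-VOTE":
        voting = False
    elif set_option == "SET-HEADERS-VOTE":
        voting = is_header  # only the founding member can vote
    else:
        raise ValueError(f"unknown option: {set_option}")
    return Member(record_id, is_header, voting)


def scoring_members(members, use_attributes=False):
    """Return the members that take part in scoring new records.

    Assumption: with --USE-ATTRIBUTES (the default) the Voting attribute
    is ignored, so every member is returned.
    """
    if not use_attributes:
        return list(members)
    return [m for m in members if m.voting]
```

For example, a cluster built with SET-HEADERS-VOTE and scored with USE-ATTRIBUTES would consult only its founding member.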
When running CLUSTERING-METHOD=MANY without specifying the job option NO-ADD (i.e. new records are being added), some restrictions are enforced: neither SET-ALL-VOTE nor --USE-ATTRIBUTES can be specified. If you explicitly specify either of these parameters, an error message will be produced. It is assumed that you have seeded or preclustered a database and then want to run the MANY clustering to add new records.
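The restriction can be expressed as a small validation check. The sketch below is hypothetical: the function name, the representation of job options as a set of strings, and the wording of the error messages are all assumptions made for illustration.

```python
def check_voting_options(clustering_method, options):
    """Return error messages for a set of explicitly specified job options.

    `options` is a set of option strings as they would appear in the job
    definition, e.g. {"SET-ALL-VOTE", "NO-ADD"} (hypothetical representation).
    """
    errors = []
    adding_records = "NO-ADD" not in options
    if clustering_method == "MANY" and adding_records:
        for forbidden in ("SET-ALL-VOTE", "--USE-ATTRIBUTES"):
            if forbidden in options:
                errors.append(
                    f"{forbidden} cannot be specified with "
                    "CLUSTERING-METHOD=MANY unless NO-ADD is set"
                )
    return errors
```

An empty list means the combination is allowed under the rule described above.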
Post
You may select output records for the POST phase based on the Attributes by specifying one of the following Output-Options:
  • VOTING - report only Voting records
  • --VOTING - report only Non-Voting records
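As a rough illustration, the two Output-Options behave like a filter over membership records. The dictionary layout and the function name below are hypothetical, and the sketch assumes that all records are reported when neither option is given.

```python
def select_for_post(members, output_option=None):
    """Filter membership records for output by the Voting attribute.

    Each member is a dict with a boolean "voting" key (illustrative layout).
    """
    if output_option == "VOTING":
        return [m for m in members if m["voting"]]
    if output_option == "--VOTING":
        return [m for m in members if not m["voting"]]
    # Assumption: no option given means no filtering on this attribute.
    return list(members)
```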

Undecided Attribute

When a record scores below the Accept limit but not below the Reject limit, it is added to the Cluster as a Non-Voting member and is also marked with the Undecided Attribute. When processing Non-Voting records (or all records), records with this Attribute will be retrieved.
This Attribute can be used to determine how a record came to be added as a Non-Voting member:
  • an undecided record has the Undecided attribute,
whereas
  • records added with the SET-NONE-VOTE option, or
  • a non-header record added with the SET-HEADERS-VOTE option
do not have this attribute.
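The distinction between the ways a record becomes Non-Voting can be sketched as follows. The function names, the numeric limits, and the returned dictionary are hypothetical. The sketch assumes that a record scoring at or above the Accept limit is added under the SET-* option in force, and that a record scoring below the Reject limit is not added to the cluster.

```python
def classify(score, accept_limit, reject_limit):
    """Place a score relative to the Accept and Reject limits."""
    if score >= accept_limit:
        return "accepted"
    if score >= reject_limit:
        return "undecided"  # below Accept but not below Reject
    return "rejected"


def membership_attributes(score, accept_limit, reject_limit,
                          set_option="SET-ALL-VOTE"):
    """Return the attributes of a record matched to an existing cluster.

    The record is never the cluster's header here, so under
    SET-HEADERS-VOTE an accepted record is Non-Voting.
    Returns None when the record is not added (assumption).
    """
    outcome = classify(score, accept_limit, reject_limit)
    if outcome == "rejected":
        return None
    if outcome == "undecided":
        return {"voting": False, "undecided": True}
    return {"voting": set_option == "SET-ALL-VOTE", "undecided": False}
```

With an Accept limit of 90 and a Reject limit of 70 (arbitrary example values), a score of 80 yields a Non-Voting member carrying the Undecided attribute, while a score of 95 under SET-NONE-VOTE yields a Non-Voting member without it.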
