Installation and Configuration Guide

Configuring the Indexes

You can define one or more indexes in the matching rules file. You define each index within an MDMBDRMMatchRuleSet section, so to define multiple indexes you must create multiple MDMBDRMMatchRuleSet sections.
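As a rough sketch of that layout, the fragment below shows two rule sets, each carrying its own index definition. The column names and the choice of key fields are illustrative assumptions, and any enclosing elements or other rule set content that the matching rules file requires are omitted.

```xml
<!-- First rule set: index on a hypothetical person-name column -->
<MDMBDRMMatchRuleSet>
  <IndexingConfiguration>
    <indexFieldName>PersonFullName</indexFieldName>
    <indexType>FUZZY</indexType>
    <keyField>Person_Name</keyField>
  </IndexingConfiguration>
  <!-- other rule set content omitted -->
</MDMBDRMMatchRuleSet>

<!-- Second rule set: index on a hypothetical organization-name column -->
<MDMBDRMMatchRuleSet>
  <IndexingConfiguration>
    <indexFieldName>OrganizationName</indexFieldName>
    <indexType>FUZZY</indexType>
    <keyField>Organization_Name</keyField>
  </IndexingConfiguration>
  <!-- other rule set content omitted -->
</MDMBDRMMatchRuleSet>
```

Both index definitions leave out keyLevel, so each would use the default key level, Standard.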
To define an index, add the following parameters to the IndexingConfiguration section within the MDMBDRMMatchRuleSet section:
indexFieldName
Names of the columns on which you want to index the records. If you specify multiple columns, use commas to separate them. Ensure that the column names are specified in the PZMAP section of the configuration file. For example, <indexFieldName>IDS_name,IDS_alias1,IDS_alias2</indexFieldName>.
indexType
Type of index that you want to create. Use one of the following values:
  • FUZZY. A heavy index that contains fuzzy keys.
  • USER. A lightweight index that contains exact values from the field.
keyField
Name of the SSA-NAME3 field on which you want to build the keys.
keyLevel
Optional. Type of key level to build. Use one of the following values:
  • Standard. Builds more variations than the Limited key level but uses less disk space than the Extended key level.
  • Extended. Builds more variations than the Standard key level and uses more disk space than the Standard and Limited key levels.
  • Limited. Builds fewer variations and uses less disk space than the Standard and Extended key levels.
Default is Standard.
AdditionalControl
Optional. Additional attributes to configure. You can specify the following attributes:
  • NAMEFORMAT=L|R. Indicates whether the major word in a name or address is on the left end (L) or the right end (R). For example, in Western names, the family name is on the right end of the name.
  • UNICODE_ENCODING. Specifies the Unicode format of the data that you use.
PARTITION_COLUMN_NAME
Optional. Name of the column from which to create a partition identifier that is added to the key. Use the following attributes to define the partition identifier:
  • length. Length, in bytes, of the column value to add as the prefix to the keys. The maximum length is 8 bytes. Default is 2 bytes.
  • part_of_rowkey. Indicates whether the key includes the partition identifier. Set to true to prefix the key with the partition identifier, or set to false to leave the key without the prefix.
If you configure the PARTITION_COLUMN_NAME parameter for the initial linking job or initial clustering job, you must also configure the PARTITION_COLUMN_NAME parameter when you run other jobs to update or increment the initial data.
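As an illustration of that requirement, the fragment below sketches a partitioned index definition of the kind you would keep in the configuration for both the initial job and the later update jobs. CountryCode is a hypothetical column name, and the other values are placeholders. The length value of 2 matches the documented default and is written out only for clarity.

```xml
<IndexingConfiguration>
  <indexFieldName>PersonFullName</indexFieldName>
  <indexType>FUZZY</indexType>
  <keyField>Person_Name</keyField>
  <!-- Hypothetical partition column; keys are prefixed with its partition identifier -->
  <PARTITION_COLUMN_NAME length="2" part_of_rowkey="true">CountryCode</PARTITION_COLUMN_NAME>
</IndexingConfiguration>
```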
The following sample shows an index definition for the PersonFullName column:
<IndexingConfiguration>
    <indexFieldName>PersonFullName</indexFieldName>
    <indexType>FUZZY</indexType>
    <keyField>Person_Name</keyField>
    <keyLevel>Standard</keyLevel>
    <AdditionalControl/>
    <PARTITION_COLUMN_NAME length="4" part_of_rowkey="true">ColumnName1</PARTITION_COLUMN_NAME>
</IndexingConfiguration>


Updated June 27, 2019