
Installation and Configuration Guide

Configuring Metadata

You must configure the metadata information, such as the name of the column that you want to set as the primary key, in the configuration file.
To configure the metadata information, add the following parameters to the MetaData section in the configuration file:
PK
Name of the column that you want to set as the primary key.
SOURCE_COLUMN_NAME
Name of the column to store the source information of the data.
You can use only LMT_SOURCE_NAME as the column name.
You can also use the part_of_layout attribute to specify whether the source information is part of the input data. Set the attribute to YES if the source information is part of the input data, and to NO if it is not. If you set the attribute to NO, ensure that you specify the SOURCE_NAME parameter.
For example: <SOURCE_COLUMN_NAME part_of_layout="YES">LMT_SOURCE_NAME</SOURCE_COLUMN_NAME>
SOURCE_NAME
Name of the source for the input data. If the input data does not contain the source name, use the SOURCE_NAME parameter to specify the source name. The source name cannot exceed 32 bytes.
For example: <SOURCE_NAME is_reference="YES">PRIZM</SOURCE_NAME>
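When the input data does not carry its own source information, the SOURCE_COLUMN_NAME and SOURCE_NAME parameters work together. The following fragment is a minimal sketch of that case. It reuses the PRIZM value from the example above as an illustrative source name, and it copies the is_reference attribute as-is from that example because this topic does not describe the attribute.

```xml
<!-- The source information is not part of the input data, so SOURCE_NAME supplies it. -->
<SOURCE_COLUMN_NAME part_of_layout="NO">LMT_SOURCE_NAME</SOURCE_COLUMN_NAME>
<!-- PRIZM is an illustrative value. The source name cannot exceed 32 bytes. -->
<SOURCE_NAME is_reference="YES">PRIZM</SOURCE_NAME>
```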
CLUSTER_COLUMN_NAME
Name of the column in which you want to store the link identifiers.
CLUSTER_COLUMN_SIZE
Size of the column that stores the link identifiers. Use 40 bytes as the column size.
CLUSTER_OPTION
Indicates whether you want to delete the tables that the load job created in the repository when you run the load job again.
Set to true if you want to delete the tables, and set to false if you want to append the data to the existing tables. Default is false.
MATCHSOURCES
Specifies the source of the data that you want the MapReduce jobs to use.
For example:
<MATCHSOURCES>
   <MATCHSOURCE>ORACLE</MATCHSOURCE>
   <MATCHSOURCE>MYSQL</MATCHSOURCE>
   <MATCHSOURCE>SAP</MATCHSOURCE>
</MATCHSOURCES>
The preceding example restricts the MapReduce jobs to data from the Oracle, MySQL, and SAP sources.
ALTERNATETABLEFORGROUPINFO
Optional. Indicates whether you want to have a separate table to store the link information.
Set to true if you want to have a separate table, and set to false if you do not want to have a separate table. If the value is false, ensure that you specify the ADDGROUPNUMBERTOROWKEY parameter. Default is false.
ADDGROUPNUMBERTOROWKEY
Indicates whether you want to add the link number to the record key.
If you set ALTERNATETABLEFORGROUPINFO=false, set AddGroupNumberToRowKey=true. Default is false.
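As a sketch of the dependency between these two parameters, the following fragment does not use a separate table for the link information and therefore adds the link number to the record key. The element casing follows the sample configuration at the end of this topic.

```xml
<!-- No separate table stores the link information. -->
<ALTERNATETABLEFORGROUPINFO>false</ALTERNATETABLEFORGROUPINFO>
<!-- Because the value above is false, add the link number to the record key. -->
<AddGroupNumberToRowKey>true</AddGroupNumberToRowKey>
```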
DELETEBATCHSIZE
Optional. Indicates the maximum number of records that you can delete in a single batch. Default is 1000.
PARTITION_COLUMN_NAME
Optional. Name of the column from which to create a partition identifier that is added to the key. Use the following attributes to define the partition identifier:
  • length. Defines the length of the column that you can add as the prefix to the keys. The maximum length of the column that you can use is 8 bytes. Default is 2 bytes.
  • part_of_rowkey. Indicates whether the key includes the partition identifier. Set to true if you want to prefix the key with the partition identifier, and set to false if you do not want to prefix the key with the partition identifier.
For example:
<PARTITION_COLUMN_NAME length="2" part_of_rowkey="YES">STATE</PARTITION_COLUMN_NAME>
If you configure the PARTITION_COLUMN_NAME parameter for the initial linking job, you must configure the PARTITION_COLUMN_NAME parameter when you run other jobs to update or increment the initial data.
LinkTableName
Base name for the tables that the initial loading job creates in the repository. The initial loading job uses the following format for the table names:
MDMBDRM<OrganizationID>_<LINKTABLENAME>_<PK|GROUP>
For example, if you specify LINKTABLENAME=LMT_MATCHED, the initial loading job creates an index table named MDMBDRM<OrganizationID>_LMT_MATCHED, a primary key table named MDMBDRM<OrganizationID>_LMT_MATCHED_PK, and a link table named MDMBDRM<OrganizationID>_LMT_MATCHED_GROUP.
StoreAllFields
Optional. Indicates whether to persist all the columns that you define in the PZMAP section in the repository.
Set to true if you want to persist all the columns in the repository. Set to false if you want to persist only the columns that you use to index data in the repository. Default is false.
If you plan to run the initial linking, initial loading, incremental linking, update linking, or repository data deletion job with the matching rules file, you must set StoreAllFields=true.
ColumnFamilyName
Name of the column family that groups all the columns in the repository table.
MaxConcurrentSessions
Maximum number of REST requests that you can run concurrently. A higher number improves search performance but uses more memory. You can configure the value based on the amount of available memory. Default is 200.
The following sample code shows the metadata configuration:
<MetaData>
   <PK>ROWID</PK>
   <SOURCE_NAME is_reference="YES">PRIZM</SOURCE_NAME>
   <SOURCE_COLUMN_NAME part_of_layout="YES">LMT_SOURCE_NAME</SOURCE_COLUMN_NAME>
   <CLUSTER_COLUMN_NAME>GROUPNO</CLUSTER_COLUMN_NAME>
   <CLUSTER_COLUMN_SIZE>40</CLUSTER_COLUMN_SIZE>
   <CLUSTER_OPTION deleteLinkTable="TRUE" />
   <MATCHSOURCES>
      <MATCHSOURCE>ORACLE</MATCHSOURCE>
      <MATCHSOURCE>MYSQL</MATCHSOURCE>
      <MATCHSOURCE>SAP</MATCHSOURCE>
   </MATCHSOURCES>
   <ALTERNATETABLEFORGROUPINFO>false</ALTERNATETABLEFORGROUPINFO>
   <DELETEBATCHSIZE>1000</DELETEBATCHSIZE>
   <PARTITION_COLUMN_NAME length="2" part_of_rowkey="YES">STATE</PARTITION_COLUMN_NAME>
   <LinkTableName>MDM_INDIVIDUAL_LMT_GA</LinkTableName>
   <ColumnFamilyName>MDMBDE_link_columns</ColumnFamilyName>
   <StoreAllFields>true</StoreAllFields>
   <AddGroupNumberToRowKey>true</AddGroupNumberToRowKey>
   <MaxConcurrentSessions>100</MaxConcurrentSessions>
</MetaData>


Updated June 27, 2019