User Guide

Consolidation Job

Use the consolidation job to consolidate the linked data and create a preferred record for each cluster in HDFS. The consolidation job uses the output files of an initial clustering job in HDFS as input. The consolidation job creates the preferred records based on the rules defined in the consolidation rules file.
For incremental data, use the output files of an initial clustering job that you run in the incremental mode with the --consolidate option as input for the consolidation job.
The consolidation process ignores any null values.
Image: The consolidation job reads the output files of an initial clustering job in HDFS and creates preferred records for all the clusters in HDFS.
The consolidation job performs the following tasks:
  1. Reads the output files of an initial clustering job in HDFS.
  2. Creates preferred records for all the clusters based on the rules defined in the consolidation rules file.
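The two tasks above can be illustrated with a minimal conceptual sketch. It is not the Relate 360 implementation and does not reflect the format of the consolidation rules file. The record layout, the field names, the cluster IDs, and the `most_frequent` rule are all assumptions chosen only to show the idea of building one preferred record per cluster while ignoring null values.

```python
from collections import Counter, defaultdict

# Hypothetical linked data: (cluster_id, record) pairs.
# Field names and values are illustrative, not a Relate 360 format.
LINKED = [
    ("c1", {"name": "Jon Smith", "city": None}),
    ("c1", {"name": "John Smith", "city": "Austin"}),
    ("c1", {"name": "John Smith", "city": None}),
    ("c2", {"name": None, "city": "Boston"}),
]


def most_frequent(values):
    """Hypothetical consolidation rule: pick the most common value."""
    return Counter(values).most_common(1)[0][0]


def consolidate(linked, rule=most_frequent):
    """Build one preferred record per cluster, ignoring null (None) values."""
    # Task 1: read the linked records and group them by cluster.
    clusters = defaultdict(list)
    for cluster_id, record in linked:
        clusters[cluster_id].append(record)

    # Task 2: apply the rule to the non-null values of each field.
    preferred = {}
    for cluster_id, records in clusters.items():
        candidates = {}
        for record in records:
            for field, value in record.items():
                candidates.setdefault(field, [])
                if value is not None:  # null values are skipped
                    candidates[field].append(value)
        # A field with no non-null value in the cluster stays None.
        preferred[cluster_id] = {
            field: (rule(values) if values else None)
            for field, values in candidates.items()
        }
    return preferred


if __name__ == "__main__":
    for cluster_id, record in consolidate(LINKED).items():
        print(cluster_id, record)
```

In this sketch, cluster c1 takes the city from the only record that supplies one, because the null values in the other records are never considered as candidates.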
