User Guide

Post-Clustering Job

Use the post-clustering job to read the output files of an initial clustering job in HDFS and process the input data based on the mode that you configure. The input data can be linked data or poor-quality data.
You can run the post-clustering job in one of the following modes:
Skip
Skips the records in the high-volume clusters that contain more than the specified number of records.
Recluster
Re-links the records in the high-volume clusters that contain more than the specified number of records.
Longtail
Decrypts the poor-quality records that the initial clustering job identifies and restores them to the original input format. You can cleanse the decrypted data and use it as the input data for the initial clustering job.
Export
Exports the linked data in the CSV format.
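The threshold that the skip and recluster modes share can be illustrated with a short sketch. The following Python example is not the product's implementation. The function name `split_by_cluster_size`, the parameter `max_records`, and the `(cluster_id, record)` pair layout are hypothetical and chosen only for illustration. The example assumes that a cluster counts as high-volume when it contains more than the specified number of records, as the mode descriptions state.

```python
from collections import defaultdict


def split_by_cluster_size(records, max_records):
    """Group (cluster_id, record) pairs and separate the high-volume clusters.

    Hypothetical helper: a cluster is high-volume when it contains
    more than max_records records.
    """
    clusters = defaultdict(list)
    for cluster_id, record in records:
        clusters[cluster_id].append(record)

    # Clusters at or below the threshold are left as they are.
    within = {cid: recs for cid, recs in clusters.items()
              if len(recs) <= max_records}
    # Clusters above the threshold are the candidates for skip or recluster.
    high_volume = {cid: recs for cid, recs in clusters.items()
                   if len(recs) > max_records}
    return within, high_volume


# Illustrative data only: cluster c1 has three records, cluster c2 has one.
linked = [("c1", "r1"), ("c1", "r2"), ("c1", "r3"), ("c2", "r4")]
within, high_volume = split_by_cluster_size(linked, max_records=2)
```

In this sketch, the skip mode would correspond to dropping the records in `high_volume`, and the recluster mode would correspond to linking those records again.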
The following image shows how the post-clustering job processes the input data in the skip, recluster, and longtail modes:
The post-clustering job processes the input data in HDFS and writes the processed data to HDFS.
The post-clustering job performs the following tasks:
  1. Reads the output files of an initial clustering job in HDFS.
  2. Processes the input data based on the mode that you configure.
  3. Writes the processed data to HDFS.
The following image shows how the post-clustering job processes the input data in the export mode:
The post-clustering job reads the input and output files of an initial clustering job in HDFS and exports the linked data in the CSV format.
The post-clustering job performs the following tasks:
  1. Reads the input and output files of an initial clustering job in HDFS.
  2. Exports the linked data in the CSV format.
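The two export steps above can be pictured as a join between the input records and the cluster assignments, followed by a CSV write. The following Python sketch is an illustration under stated assumptions, not the product's implementation or file layout. The function name `export_linked_csv`, the column names, and the in-memory data structures are hypothetical. The sketch also assumes that only records with a cluster assignment count as linked data.

```python
import csv
import io


def export_linked_csv(input_records, cluster_assignments, out):
    """Write one CSV row for each input record that has a cluster assignment.

    Hypothetical helper: input_records stands in for the input files and
    cluster_assignments stands in for the output files of the clustering job.
    """
    writer = csv.writer(out)
    writer.writerow(["cluster_id", "record_id", "name"])
    for record_id, name in input_records:
        cluster_id = cluster_assignments.get(record_id)
        if cluster_id is not None:  # assumption: unlinked records are left out
            writer.writerow([cluster_id, record_id, name])


# Illustrative data only: r1 and r2 are linked into cluster c1, r3 is not linked.
records = [("r1", "Ann Lee"), ("r2", "Anne Lee"), ("r3", "Bo Chan")]
assignments = {"r1": "c1", "r2": "c1"}

buffer = io.StringIO()
export_linked_csv(records, assignments, buffer)
csv_text = buffer.getvalue()
```

The sketch writes to an in-memory buffer to stay self-contained. The actual job reads from HDFS, as described above.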
