User Guide

Load Clustering Job

The load clustering job loads the output files of an HDFS tokenization job from HDFS into the repository. Before you run the load clustering job, you can run the region splitter job to identify the split points for the input data.
The following image shows how the load clustering job loads the data into the repository:
The load clustering job reads the output files of an HDFS tokenization job from HDFS and loads the tokenized data into the repository.
The load clustering job performs the following tasks:
  1. Reads the data from the output files of an HDFS tokenization job in HDFS.
  2. Loads the tokenized data into the repository.
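
The two tasks above can be pictured with a small sketch. This is only an illustration of the read-then-load flow, not the product's implementation: it assumes a local directory stands in for HDFS, an in-memory dictionary stands in for the repository, and each output line has a hypothetical "<token><TAB><record>" layout. The function name load_tokenized_files and the part-* file names are also assumptions made for the example.

```python
import tempfile
from collections import defaultdict
from pathlib import Path


def load_tokenized_files(input_dir, repository):
    """Read every part file in input_dir and load its records into repository.

    Assumption: each line is "<token>\t<record>". This layout is made up
    for illustration and is not the product's real output format.
    Returns the number of records loaded.
    """
    loaded = 0
    # Task 1: read the data from the tokenization output files.
    for part_file in sorted(Path(input_dir).glob("part-*")):
        with part_file.open(encoding="utf-8") as handle:
            for line in handle:
                line = line.rstrip("\n")
                if not line:
                    continue
                token, record = line.split("\t", 1)
                # Task 2: load the tokenized data into the repository
                # (here a plain dict keyed by token).
                repository[token].append(record)
                loaded += 1
    return loaded


def demo():
    """Build two small sample files and load them into a stand-in repository."""
    repository = defaultdict(list)
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "part-00000").write_text(
            "TOK1\tJohn Smith\nTOK2\tJane Doe\n", encoding="utf-8"
        )
        Path(tmp, "part-00001").write_text("TOK1\tJon Smith\n", encoding="utf-8")
        count = load_tokenized_files(tmp, repository)
    return count, dict(repository)
```

Calling demo() loads the three sample records and groups them by token, which mirrors how records that share a token end up stored together in this stand-in repository.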
