User Guide

Load Clustering Job

The load clustering job loads the output files of an initial clustering job from HDFS into the repository. Before you run the load clustering job, you can run the region splitter job to identify the split points for the input data.

You can also run the load clustering job in incremental mode to load incremental data into the repository.

[Image: The load clustering job reads the output files of the initial clustering job from HDFS and loads the linked data into the repository.]

The load clustering job performs the following tasks:

1. Reads the linked data from the output files of an initial clustering job in HDFS.
2. Loads the linked data into the repository.
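
The two tasks above can be sketched in Python. This is a minimal illustration under stated assumptions, not the product's implementation: a local directory stands in for HDFS, a plain dictionary stands in for the repository, and both the `cluster_id,record_id` line format and the `part-*` file naming are assumed, because this section does not describe the actual output format. The function names `read_linked_data` and `load_into_repository` are hypothetical.

```python
import csv
import tempfile
from pathlib import Path


def read_linked_data(output_dir):
    """Yield (cluster_id, record_id) pairs from every part file in output_dir.

    Assumption: each line is "cluster_id,record_id". The real output format
    of the initial clustering job is not described in this section.
    """
    for part_file in sorted(Path(output_dir).glob("part-*")):
        with part_file.open(newline="") as handle:
            for row in csv.reader(handle):
                if len(row) >= 2:  # skip blank or malformed lines
                    yield row[0], row[1]


def load_into_repository(repository, linked_rows):
    """Add linked rows to the repository and return the number of rows read.

    The repository here is a plain dict (cluster_id -> set of record_ids),
    a stand-in for the real repository.
    """
    loaded = 0
    for cluster_id, record_id in linked_rows:
        repository.setdefault(cluster_id, set()).add(record_id)
        loaded += 1
    return loaded


if __name__ == "__main__":
    # A temporary local directory plays the role of the HDFS output directory.
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "part-00000").write_text("c1,r1\nc1,r2\nc2,r3\n")
        repository = {}
        count = load_into_repository(repository, read_linked_data(tmp))
        print(count, "rows loaded into", len(repository), "clusters")
```

Calling `load_into_repository` again with rows from a later run adds them to the same dictionary, which loosely mirrors the idea of loading incremental data. The actual incremental mode may behave differently.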
