The load clustering job loads the output files of an initial clustering job from HDFS into the repository. Before you run the load clustering job, you can run the region splitter job to identify the split points for the input data.
You can also run the load clustering job in incremental mode to load incremental data into the repository.
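This topic does not describe how the region splitter job chooses split points. As a rough illustration of the idea only, the sketch below picks boundary keys that divide a sorted set of row keys into ranges of roughly equal size. The function name `find_split_points`, its parameters, and the use of string row keys are assumptions made for this example and are not part of the product.

```python
from typing import List


def find_split_points(row_keys: List[str], num_regions: int) -> List[str]:
    """Return up to num_regions - 1 boundary keys (hypothetical helper).

    Assumption: row keys are strings and are compared lexicographically.
    """
    if num_regions < 1:
        raise ValueError("num_regions must be at least 1")

    # Work on the distinct keys in sorted order.
    keys = sorted(set(row_keys))
    if num_regions == 1 or len(keys) < 2:
        return []

    splits: List[str] = []
    for i in range(1, num_regions):
        # Index of the key that starts the i-th range.
        idx = (i * len(keys)) // num_regions
        key = keys[idx]
        # Skip the first key and any repeated boundary so that
        # every resulting range contains at least one key.
        if idx > 0 and (not splits or key > splits[-1]):
            splits.append(key)
    return splits


if __name__ == "__main__":
    sample_keys = ["k%02d" % n for n in range(12)]
    print(find_split_points(sample_keys, 4))
```

With twelve evenly distributed keys and four target ranges, the sketch returns three boundary keys. When there are fewer distinct keys than requested ranges, it returns fewer boundaries rather than repeating one.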
The following image shows how the load clustering job loads the data into the repository:
The load clustering job performs the following tasks:
Reads the linked data from the output files of an initial clustering job in HDFS.