Batch Processing

In the MDM Hub, a batch job is a program that completes a discrete unit of work when it runs. This discrete unit of work is called a process. Processes are multithreaded. Batch jobs can run in parallel on all the child base objects that are in the match path of the parent base object.
For example, you can use batch processing the first time you load business data into the Hub Store. Batch processing is the most efficient way to load a large number of records from source systems.
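The parallelism described above can be pictured with a minimal sketch. It is illustrative only and does not use the MDM Hub API: the function names `run_batch_job` and `run_on_children` and the base object names in the usage note are hypothetical, and a thread pool stands in for whatever scheduling the Hub performs internally.

```python
from concurrent.futures import ThreadPoolExecutor


def run_batch_job(base_object):
    # Hypothetical placeholder for one batch job on one child base object.
    return f"{base_object}: done"


def run_on_children(children, max_workers=4):
    # Submit one job per child base object and run them concurrently.
    # pool.map returns results in the same order as the input list.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_batch_job, children))
```

For example, `run_on_children(["ADDRESS", "PHONE"])` runs one placeholder job for each of the two hypothetical child base objects and collects both results.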
The data that you load from source systems goes through the following series of processes:
Step 1: Land
Transfers data from a source system that is external to the MDM Hub to landing tables in the Hub Store. Part of the reconciliation process described in Main Inbound Data Flow (Reconciliation).
Step 2: Stage
Retrieves data from the landing table, cleanses it, and copies it into a staging table in the Hub Store. Part of the reconciliation process.
Step 3: Load
Loads data from the staging table into the corresponding Hub Store table called the base object. Part of the reconciliation process.
Step 4: Tokenize
Generates match tokens in a match key table that the match process uses to identify candidate base object records for matching.
Step 5: Match
Compares records for points of similarity (based on match rules), determines whether records are duplicates, and flags duplicate records for consolidation. Part of the reconciliation process.
Step 6: Consolidate
Merges data in duplicate records to create a consolidated record that contains the most reliable cell values from the source records. Part of the reconciliation process.
Step 7: Publish
Publishes the best version of the truth to other systems or processes that use outbound JMS message queues. Part of the distribution process described in Main Outbound Data Flow (Distribution).
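The ordering of the steps above can be sketched as a small model. This is not MDM Hub code: the names `Flow`, `BATCH_STEPS`, and `run_pipeline` are hypothetical, the handlers are stand-ins for real batch jobs, and stopping at the first failed step is an assumption made for the illustration. Tokenize is marked as unspecified because the list above does not assign it to a flow.

```python
from enum import Enum


class Flow(Enum):
    RECONCILIATION = "reconciliation"
    DISTRIBUTION = "distribution"
    UNSPECIFIED = "unspecified"  # the step list does not name a flow


# Steps in the order listed above, each with the flow it belongs to.
BATCH_STEPS = [
    ("Land", Flow.RECONCILIATION),
    ("Stage", Flow.RECONCILIATION),
    ("Load", Flow.RECONCILIATION),
    ("Tokenize", Flow.UNSPECIFIED),
    ("Match", Flow.RECONCILIATION),
    ("Consolidate", Flow.RECONCILIATION),
    ("Publish", Flow.DISTRIBUTION),
]


def run_pipeline(handlers, start="Land"):
    """Call each step's handler in order, beginning at `start`.

    `handlers` maps a step name to a callable that returns True on success.
    Assumption for this sketch: processing stops at the first failed step.
    Returns the names of the steps that completed.
    """
    names = [name for name, _ in BATCH_STEPS]
    completed = []
    for name in names[names.index(start):]:
        if not handlers[name]():
            break
        completed.append(name)
    return completed
```

With handlers that always succeed, `run_pipeline` returns all seven step names in order. If the hypothetical Match handler fails, only Land, Stage, Load, and Tokenize are reported as completed.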
For more information about batch processes, see the Multidomain MDM Configuration Guide, Multidomain MDM Services Integration Framework Guide, Multidomain MDM Data Steward Guide, and the Multidomain MDM Javadoc.
