User Guide

INGEST Operation

The INGEST operation reads the input data and then links or tokenizes the input data based on the value that you set for the ALTERNATETABLEFORGROUPINFO parameter in the configuration file. The INGEST operation then loads the linked or tokenized data into the repository.
When you deploy Relate 360 on Spark or Storm, if you have specified the consolidation rules file, the INGEST operation consolidates the linked data and creates preferred records in the repository. If any input record is an update to an existing record, the INGEST operation updates the record in the repository.
During the linking process, when an input record matches a record in the repository, the linking process links the input record to the cluster of the matching record. The linking process does not then match the input record against any other records.
When you link the input data, ensure that you do not configure the MaxCandidateSet parameter in the configuration file. The MaxCandidateSet parameter value affects the maximum number of records that the INGEST operation can add to a cluster.
Run the run_client.sh script located in the following directory to perform the INGEST operation:
/usr/local/mdmbdrm-<Version Number>
Use the following command to run the run_client.sh script:
run_client.sh --config=configuration_file_name --rule=matching_rules_file_name --operation=INGEST --input=input_file_name [--outputfile=output_file_name]
The following table describes the options and arguments that you can specify to run the run_client.sh script:

Option        Argument                  Description
--config      configuration_file_name   Absolute path and file name of the configuration file that you create.
--rule        matching_rules_file_name  Absolute path and file name of the matching rules file that you create.
--operation   INGEST                    Type of operation that you want to perform. Specify INGEST.
--input       input_file_name           Absolute path and name of the input JSON file that contains the input data.
--outputfile  output_file_name          Optional. Absolute path and name of the output JSON file to which you want to load the processed data.

For example:
run_client.sh --config=/usr/local/tree/configuration.xml --operation=INGEST --input=/usr/local/tree/input.json --rule=/usr/local/conf/matching_rules.xml --outputfile=/usr/local/tree/output.json
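Because every file argument must be an absolute path, it can help to assemble the command with a small wrapper that rejects relative paths before anything runs. The following is a minimal sketch, not part of Relate 360: the function names require_absolute and build_ingest_cmd are hypothetical, and the function only prints the command line rather than running it.

```shell
#!/usr/bin/env bash
# Sketch only: require_absolute and build_ingest_cmd are hypothetical helpers,
# not part of the product. They use only the options documented above.

# Return 0 if the path starts with "/", otherwise report it and return 1.
require_absolute() {
  case "$1" in
    /*) return 0 ;;
    *)  echo "not an absolute path: $1" >&2; return 1 ;;
  esac
}

# Usage: build_ingest_cmd CONFIG RULE INPUT [OUTPUT]
# Prints the run_client.sh command line for an INGEST operation.
build_ingest_cmd() {
  local config=$1 rule=$2 input=$3 output=${4:-}
  local p
  for p in "$config" "$rule" "$input"; do
    require_absolute "$p" || return 1
  done
  local cmd="run_client.sh --config=$config --rule=$rule --operation=INGEST --input=$input"
  # --outputfile is optional, so append it only when a fourth argument is given.
  if [ -n "$output" ]; then
    require_absolute "$output" || return 1
    cmd="$cmd --outputfile=$output"
  fi
  printf '%s\n' "$cmd"
}
```

For instance, build_ingest_cmd /usr/local/tree/configuration.xml /usr/local/conf/matching_rules.xml /usr/local/tree/input.json prints a command without the --outputfile option, while passing a relative path for any argument returns a non-zero status. The sketch assumes paths without spaces.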
The following input data contains a record for the INGEST operation to process:
{ "input":{ "SOURCE":"AML", "PERSON":"Susan Shaw", "ADDRESS":"Castle Boulevard", "POSTCODE":"94061", "CITY":"Redwood City", "ROWID":"0000300002" }, "resultCount":0, "messages":{ } }
The following sample output shows a record that the INGEST operation submits for processing:
{ "input":{ "SOURCE":"AML", "PERSON":"Susan Shaw", "ADDRESS":"Castle Boulevard", "POSTCODE":"94061", "CITY":"Redwood City", "ROWID":"0000300002" }, "resultCount":1, "messages":{ "Message.1":"Record submitted for Processing" } }
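If you specify the --outputfile option, a quick way to confirm that a record was submitted is to look for the message text from the sample output. The following is a rough sketch: the function name ingest_submitted is hypothetical, and the check is a plain text match that assumes the message reads exactly as in the sample. It does not parse the JSON.

```shell
#!/usr/bin/env bash
# Sketch only: ingest_submitted is a hypothetical helper, not part of the product.
# Returns 0 if the given output file contains the submission message
# shown in the sample output, and non-zero otherwise.
ingest_submitted() {
  grep -qF '"Record submitted for Processing"' "$1"
}
```

For example, ingest_submitted /usr/local/tree/output.json && echo "submitted" prints the word only when the message is present in the file.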
