Table of Contents


  1. Preface
  2. Installing Informatica MDM - Relate 360
  3. Configuring Relate 360
  4. Configuring Security
  5. Setting Up the Environment to Process Streaming Data
  6. Configuring Distributed Search
  7. Packaging and Deploying the RESTful Web Services

Installation and Configuration Guide


Step 3. Deploy Relate 360 on Storm


You must deploy Relate 360 on Storm to link, consolidate, or tokenize the input data. To deploy Relate 360 on Storm, run the script located in the following directory:

/usr/local/mdmbdrm-<Version Number>

Use the following command to run the script:

--config=configuration_file_name --rule=matching_rules_file_name --useStorm [--consolidate=consolidation_rules_file_name] [--instanceName=instance_name] [--spoutName=spout_name] [--workers=number_of_worker_processes] [--zookeeper=zookeeper_connection_string] [--skipCreateTopic] [--partitions=number_of_partitions] [--replica=number_of_replicas] [--outputTopic=output_topic_name]
The following table describes the options and arguments that you can specify to run the script:

--config=configuration_file_name
Absolute path and file name of the configuration file that you create.

--rule=matching_rules_file_name
Absolute path and file name of the matching rules file that you create. The values in the matching rules file override the values in the configuration file.

--useStorm
Indicates to deploy Relate 360 on Storm.

--consolidate=consolidation_rules_file_name
Absolute path and file name of the consolidation rules file. Use the consolidation rules file only when you want to consolidate the linked data and create preferred records for all the clusters.

--instanceName=instance_name
Optional. Name for the topology that processes the input data. Default is

--spoutName=spout_name
Optional. Name for the spout that reads the input data and emits the input data into the topology. Default is

--workers=number_of_worker_processes
Optional. Number of worker processes for the topology. Each worker process is a physical JVM and runs a subset of all the tasks for the topology. Default is 3.
--zookeeper=zookeeper_connection_string
Optional. Connection string to access the ZooKeeper server. Use the following format for the connection string:

<Host Name>:<Port>[/<chroot>]

The connection string uses the following parameters:
  • Host Name. Host name of the ZooKeeper server.
  • Port. Port on which the ZooKeeper server listens.
  • chroot. Optional. ZooKeeper root directory that you configure in Kafka. Default is /.

The following example connection string uses the default ZooKeeper root directory:
The following example connection string uses the user-defined ZooKeeper root directory:
If you use an ensemble of ZooKeeper servers, you can specify multiple ZooKeeper servers separated by commas.
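The connection string format above can be sketched as a small helper. The host names and chroot in this example are hypothetical placeholders for illustration, not values from this guide:

```python
def zookeeper_connection_string(hosts, chroot=None):
    # Build a connection string in the documented
    # <Host Name>:<Port>[/<chroot>] format. For an ensemble,
    # the host:port pairs are comma-separated, and a single
    # optional chroot suffix applies to the whole string.
    base = ",".join(f"{host}:{port}" for host, port in hosts)
    if chroot and chroot != "/":
        return base + chroot
    return base  # the default chroot is /

# Hypothetical hosts for illustration only:
print(zookeeper_connection_string([("zk1.example.com", 2181)]))
# → zk1.example.com:2181
print(zookeeper_connection_string(
    [("zk1.example.com", 2181), ("zk2.example.com", 2181)],
    chroot="/kafka",
))
# → zk1.example.com:2181,zk2.example.com:2181/kafka
```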
--skipCreateTopic
Required if the topic that you specify in the configuration file already exists in Kafka. Indicates to skip creating the topic. By default, the script creates the topic.

--partitions=number_of_partitions
Optional. Number of partitions for the topic. Use partitions to split the data in the topic across multiple brokers. Default is 1.

--replica=number_of_replicas
Optional. Number of replicas that you want to create for the topic. Use replicas for high availability purposes. Default is 1.

--outputTopic=output_topic_name
Optional. Name of the topic in Kafka to which you want to publish the output messages. By default, the output messages are not published. The script does not create the output topic, so ensure that you create the output topic to publish the output messages to it.
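Because the script does not create the output topic, you can create it ahead of deployment with Kafka's stock kafka-topics.sh tool. This command is not part of Relate 360; the ZooKeeper address is a placeholder, and the topic name, partition count, and replication factor should match your cluster:

```shell
# Create the output topic before deploying the topology
# (placeholder ZooKeeper address; adjust values for your cluster).
kafka-topics.sh --create \
  --zookeeper zk1.example.com:2181 \
  --topic InsuranceOutput \
  --partitions 3 \
  --replication-factor 2
```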
For example, the following command runs the script that deploys Relate 360 on Storm:

--config=/usr/local/conf/config_big.xml --rule=/usr/local/conf/matching_rules.xml --useStorm --consolidate=/usr/local/conf/consolidationfile.xml --instanceName=Prospects --zookeeper= --skipCreateTopic --partitions=3 --replica=2 --spoutName=Insurance --workers=5 --outputTopic=InsuranceOutput
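The option handling in commands like the one above can be sketched as a small helper that assembles the command line from a mapping. The script path here is a hypothetical placeholder; substitute the script shipped in the directory named earlier in this step:

```python
def build_deploy_command(script, options):
    # Assemble the deployment command line. A value of True
    # becomes a bare flag (e.g. --useStorm); any other value
    # becomes a --name=value pair.
    parts = [script]
    for name, value in options.items():
        if value is True:
            parts.append(f"--{name}")
        else:
            parts.append(f"--{name}={value}")
    return " ".join(parts)

# "./deploy.sh" is a placeholder script name for illustration.
cmd = build_deploy_command("./deploy.sh", {
    "config": "/usr/local/conf/config_big.xml",
    "rule": "/usr/local/conf/matching_rules.xml",
    "useStorm": True,
    "workers": 5,
})
print(cmd)
# → ./deploy.sh --config=/usr/local/conf/config_big.xml --rule=/usr/local/conf/matching_rules.xml --useStorm --workers=5
```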