Table of Contents

  1. Preface
  2. Introduction to Data Integration Hub
  3. Catalog
  4. Applications
  5. Topics
  6. Creating Topics
  7. Topic Properties
  8. Publications
  9. Creating Publications
  10. Publication Properties
  11. Subscriptions
  12. Creating Subscriptions
  13. Subscription Properties
  14. Events and Event Monitoring
  15. Dashboard and Reports
  16. Glossary

Operator Guide

Publication Repository Types

When you create a topic, you choose the type of publication repository in which Data Integration Hub manages and stores published data for the topic. Data Integration Hub can store topic data in the following types of publication repositories:
  • Relational database. Choose this type of repository to store published data in a relational database structure that represents the structure in which you want to keep the data, for example, data that is published from a relational database or from files. A relational database publication repository usually stores the data for a short intermediate period after all subscribers have consumed it. Data Integration Hub supports the following databases for relational database topic data: Oracle and Microsoft SQL Server.
  • Big Data. Choose this type of repository if you publish high volumes of data that you want to store for a long period of time, or if you do not want Data Integration Hub to delete published data after the data is consumed. The availability of the Hadoop repository depends on whether the Hadoop component is installed on your system.
    To publish and subscribe to a Hadoop-based repository with custom publications and subscriptions, you must use workflows that are based on a Data Engineering Integration mapping and workflow. When you create a custom publication, if one of the topics that you select for the publication is a Hadoop-based topic, only workflows that are based on a Data Engineering Integration mapping or workflow are listed for selection as the publication mapping.
    When you create a compound subscription, that is, a subscription that subscribes to multiple topics, all topics that you select must be Hadoop-based, and only workflows that are based on a Data Engineering Integration mapping or workflow are listed for selection as the subscription mapping. You can also enable the Mandatory option for topics in a compound subscription to prioritize some topics over others.
    Data Integration Hub triggers subscription processing after the publication events for all topics are complete. If the wait time of the publication event elapses and Data Integration Hub has not published all mandatory topics, an error event is generated at run time.
    Before you use a Hadoop-based publication repository for publications and subscriptions, consider the following restrictions:
    • You cannot assign a pre-process to a custom publication that publishes to a Hadoop-based repository.
    • You cannot configure a custom publication that publishes files to a Hadoop-based repository to run immediately when the files are ready to be published.
    • You cannot use a Hadoop repository to publish and subscribe to pass-through files and Hadoop Distributed File System (HDFS) files.
  • File Store. Choose this type of repository to publish files that you want to keep as-is, without loading the data into a relational database. For example, if you publish PDF or .zip files into a file repository, Data Integration Hub delivers the files without processing them.
  • Real-time. Choose this type of repository to monitor real-time Apache Kafka data streaming. Apache Kafka is a distributed streaming platform that can publish and subscribe to streams of records, and store and process streams of records.
    To track the Kafka flows, you must configure the Apache Kafka server URL in the Data Integration Hub system properties. You must then create a topic with the Real-time publication repository type and create an application to define the publisher and subscriber. Also, create a workflow that maps to the Apache Kafka server. The Data Integration Hub publication and subscription are associated with the source and target of the Kafka server.
    Data Integration Hub records the streaming of data in the Apache Kafka server at regular intervals. The Data Integration Hub operator configures the interval at which Data Integration Hub must record the data streaming value in the topic. The Events List stores the log of events. The Processing Information tab in the Events List stores the Offset and LogEndOffset values that define the difference between data values at intervals in each partition. For more information about events, see Managing Events on the Event List Page. A sketch of the offset arithmetic appears after this list.
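The Offset and LogEndOffset values follow standard Kafka consumer-lag arithmetic: in each partition, the log end offset minus the last consumed offset is the number of published records that have not yet been consumed. The following Python sketch reads those per-partition values with the open-source kafka-python client. The broker address, consumer group, and topic name are illustrative assumptions; Data Integration Hub performs this tracking internally, so the sketch only models the arithmetic behind the Processing Information tab.

    from kafka import KafkaConsumer, TopicPartition

    # Assumed broker URL and hypothetical group and topic names.
    consumer = KafkaConsumer(
        bootstrap_servers="localhost:9092",
        group_id="dih-monitor",
        enable_auto_commit=False,
    )

    topic = "orders"
    partitions = [TopicPartition(topic, p)
                  for p in consumer.partitions_for_topic(topic)]

    # LogEndOffset: the offset of the next record to be written in each partition.
    end_offsets = consumer.end_offsets(partitions)

    for tp in partitions:
        committed = consumer.committed(tp) or 0  # Offset: last committed consumer position
        lag = end_offsets[tp] - committed        # records published but not yet consumed
        print(f"partition {tp.partition}: offset={committed}, "
              f"logEndOffset={end_offsets[tp]}, lag={lag}")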
When you create a compound subscription, that is, a subscription that consumes data sets from multiple topics with a single batch workflow, all topics must be of the same type. The Data Integration Hub operator can enable the Mandatory option for topics in a compound subscription to prioritize some topics over others. Data Integration Hub triggers subscription processing after the publication events for all topics are complete. If the wait time of the publication event elapses and Data Integration Hub has not published all mandatory topics, an error event is generated at run time.
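The trigger rule for compound subscriptions can be summarized as a small decision check. The following Python sketch is illustrative only, not a Data Integration Hub API; it assumes, as the text implies, that processing can proceed without non-mandatory topics once the wait time elapses.

    import time

    def evaluate_subscription(all_topics, mandatory_topics, published_topics,
                              started_at, wait_time):
        # All publication events are complete: trigger subscription processing.
        if all_topics <= published_topics:
            return "process"
        # The wait time has elapsed before every topic was published.
        if time.time() - started_at >= wait_time:
            if mandatory_topics <= published_topics:
                # Assumption: missing non-mandatory topics do not block processing.
                return "process"
            # A mandatory topic is missing: generate an error event at run time.
            return "error-event"
        # Otherwise, keep waiting for the remaining publication events.
        return "wait"

    # Example: three topics, two mandatory, evaluated after the wait time elapsed.
    state = evaluate_subscription(
        all_topics={"sales", "inventory", "returns"},
        mandatory_topics={"sales", "inventory"},
        published_topics={"sales", "inventory"},
        started_at=time.time() - 600,  # evaluation started 10 minutes ago
        wait_time=300,                 # 5-minute wait time
    )
    print(state)  # "process": all mandatory topics were published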
