Table of Contents

  1. Preface
  2. Introduction to Data Integration Hub
  3. Getting Started with Data Integration Hub
  4. Creating Topics
  5. Creating Publications
  6. Creating Subscriptions
  7. Appendix A: Glossary

Getting Started

Creating Topics Overview

In this section, you create topics to which applications publish data and from which applications consume data. You must first complete the chapter "Getting Started with Data Integration Hub."
When you create a topic, you choose the topic type and the type of repository on which to store data for the topic, define the data structure and the data retention period, select a data storage location, and assign topic permissions.

Chapter Concepts

Data Integration Hub can manage and store topic data on the following types of publication repository:
  • Relational database. Choose this type of repository to store published data in a relational database structure that represents the structure in which you want to keep the data, for example, data that is published from a relational database or from files. A relational database publication repository usually stores the data for a short intermediate period after the data is consumed by all subscribers. Data Integration Hub supports Oracle and Microsoft SQL Server databases for storing relational database topic data.
  • Big Data. Choose this type of repository if you publish high volumes of data that you want to store for a long period of time, or if you do not want Data Integration Hub to delete published data after the data is consumed. The Hadoop repository is available only if the Hadoop component is installed on your system.
    To publish and subscribe to a Hadoop-based repository with custom publications and subscriptions, you must use workflows that are based on a Data Engineering Integration mapping and workflow. When you create a custom publication, if one of the topics that you select for the publication is a Hadoop-based topic, only workflows that are based on a Data Engineering Integration mapping or workflow are listed for selection as the publication mapping.
    When you create a compound subscription that subscribes to multiple topics, all topics that you select must be Hadoop-based, and only workflows that are based on a Data Engineering Integration mapping or workflow are listed for selection as the subscription mapping. You can also enable the mandatory option for topics in a compound subscription to prioritize some topics in the subscription over others. Data Integration Hub triggers subscription processing after the publication events for all topics are complete. If the wait time of the publication event elapses and Data Integration Hub has not published all mandatory topics, an error event is generated at run time.
    Before you use a Hadoop-based publication repository for publications and subscriptions, consider the following restrictions:
    • You cannot assign a pre-process to a custom publication that publishes to a Hadoop-based repository.
    • You cannot configure a custom publication that publishes files to a Hadoop-based repository to run immediately when the files are ready to be published.
    • You cannot use a Hadoop repository to publish and subscribe to pass-through files and Hadoop Distributed File System (HDFS) files.
  • File Store. Choose this type of repository to publish files that you want to keep as-is, without loading the data into a relational database. For example, if you publish PDF or .zip files into a file repository, Data Integration Hub delivers the files without processing them.
  • Real-time. Choose this type of repository to monitor real-time Apache Kafka data streaming. Apache Kafka is a distributed streaming platform that can publish and subscribe to streams of records, and store and process streams of records. To track the Apache Kafka flows, you must configure the Apache Kafka server URL in the Data Integration Hub system properties.
    You must then create a topic with the publication repository type of Real-time and create an application to define the publisher and subscriber. You must also create a workflow that maps to the Apache Kafka server. The Data Integration Hub publication and subscription are associated with the source and target of the Kafka server.
    Data Integration Hub records the streaming of data in the Apache Kafka server at regular intervals. The Data Integration Hub operator configures the interval at which Data Integration Hub records the data streaming value in the topic. The Events List stores the log of events. The Processing Information tab in the Events List stores the Offset and LogEndOffset values, which define the difference between data values at intervals in each partition; a sketch of how these offset values relate appears after this list.
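The Offset and LogEndOffset values follow standard Apache Kafka consumer bookkeeping: the offset is the position of the last record that a consumer group has consumed in a partition, and the log end offset is the position of the next record to be written to that partition, so their difference is the consumer lag. The following minimal sketch, independent of Data Integration Hub, reads both values per partition with the kafka-python client; the broker address, topic name, and consumer group name are assumptions for illustration.

    from kafka import KafkaConsumer, TopicPartition

    # Assumed broker URL and consumer group; replace with your environment.
    consumer = KafkaConsumer(
        bootstrap_servers="localhost:9092",
        group_id="dih-monitor",
        enable_auto_commit=False,
    )

    topic = "orders"  # hypothetical topic name
    partitions = [TopicPartition(topic, p)
                  for p in sorted(consumer.partitions_for_topic(topic))]

    # LogEndOffset per partition: position of the next record to be written.
    end_offsets = consumer.end_offsets(partitions)

    for tp in partitions:
        committed = consumer.committed(tp) or 0  # last consumed Offset, 0 if none
        lag = end_offsets[tp] - committed        # records not yet consumed
        print(f"partition {tp.partition}: offset={committed}, "
              f"logEndOffset={end_offsets[tp]}, lag={lag}")

    consumer.close()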
When you create the structure of a topic, you define the data structure on the Data Integration Hub publication repository to which the publications that are associated with the topic publish data, and from which subscribers to the topic consume the data. The topic structure must contain at least one table and can consist of multiple tables.
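For illustration only, the following sketch shows what a minimal multi-table topic structure might look like as relational tables. The Orders topic, its tables, and its columns are hypothetical, and SQLite stands in here for the Oracle or Microsoft SQL Server publication repository on which Data Integration Hub actually creates the tables.

    import sqlite3

    # Hypothetical two-table structure for an "Orders" topic, kept in an
    # in-memory SQLite database only to make the sketch self-contained.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE ORDERS (              -- a topic needs at least one table
            ORDER_ID   INTEGER PRIMARY KEY,
            CUSTOMER   TEXT NOT NULL,
            ORDER_DATE TEXT NOT NULL
        );
        CREATE TABLE ORDER_LINES (         -- and can consist of multiple tables
            ORDER_ID   INTEGER REFERENCES ORDERS(ORDER_ID),
            PRODUCT    TEXT NOT NULL,
            QUANTITY   INTEGER NOT NULL
        );
    """)
    conn.close()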
The data retention period defines how long Data Integration Hub retains the data in the publication repository after the data is consumed.
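As a rough model of retention, the sketch below checks whether consumed data has outlived its retention period and is therefore eligible for deletion; the 14-day period and the timestamps are assumptions, not Data Integration Hub defaults.

    from datetime import datetime, timedelta, timezone

    RETENTION_PERIOD = timedelta(days=14)  # hypothetical retention period

    def eligible_for_deletion(consumed_at: datetime, now: datetime) -> bool:
        # Data may be deleted once the retention period has elapsed after
        # all subscribers consumed it.
        return now - consumed_at > RETENTION_PERIOD

    now = datetime.now(timezone.utc)
    print(eligible_for_deletion(now - timedelta(days=20), now))  # True
    print(eligible_for_deletion(now - timedelta(days=3), now))   # False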
Topic permissions control who can access the topic. The Data Integration Hub administrator creates categories and assigns categories to user groups to determine which users can view or change topics. You assign categories to a topic to permit users to view or change the topic. Because publications and subscriptions are associated with topics, they inherit the permissions from the associated topic. When you configure permissions for a topic, only user groups with permissions to the topic can access the associated subscriptions and publications.
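To make the inheritance rule concrete, here is a minimal sketch of the category-based model described above: a user group can access a topic when it shares at least one category with the topic, and publications and subscriptions simply reuse the topic's check. All category, group, and topic names are illustrative assumptions.

    # Hypothetical category assignments made by the administrator.
    topic_categories = {"Orders": {"Sales", "Finance"}}
    group_categories = {"sales_users": {"Sales"}, "hr_users": {"HR"}}

    def can_access_topic(group: str, topic: str) -> bool:
        # Access requires at least one category in common.
        return bool(group_categories[group] & topic_categories[topic])

    def can_access_publication(group: str, topic_of_publication: str) -> bool:
        # Publications and subscriptions inherit permissions from their topic.
        return can_access_topic(group, topic_of_publication)

    print(can_access_topic("sales_users", "Orders"))      # True
    print(can_access_publication("hr_users", "Orders"))   # False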

Chapter Objectives

In this chapter, you perform the following tasks:
  • Create a topic where the published data is stored on a relational database.
  • Create a topic where the published data is stored on a Hadoop repository.
  • Create a topic where the published data is stored on a file repository.
  • Create a topic where the published data is stored on an Apache Kafka server.
