Table of Contents

  1. Preface
  2. PowerExchange CDC Publisher Overview
  3. Installing and Upgrading PowerExchange CDC Publisher
  4. PowerExchange CDC Publisher Key Concepts
  5. PowerExchange Change Capture Environment
  6. Target Messaging Systems
  7. Configuring PowerExchange CDC Publisher
  8. Streaming Change Data
  9. Monitoring PowerExchange CDC Publisher
  10. Administering PowerExchange CDC Publisher
  11. Appendix A: Command Reference for the Command-Line Utilities
  12. Appendix B: Avro Schema Formats
  13. Appendix C: Custom Pattern Formats
  14. Appendix D: Message Reference

User Guide

Product Overview

The PowerExchange CDC Publisher is a Java-based tool that streams change data that has been captured from a PowerExchange data source to a target messaging system such as Apache Kafka. This tool is licensed as an option of the PowerExchange CDC product.
Typically, the PowerExchange CDC Publisher runs continuously as a Linux daemon or as a Windows foreground process until you stop it. It acts as a client of both the PowerExchange system and target messaging system. You can run the PowerExchange CDC Publisher on any Linux or Windows system in your environment, including a system that is remote from the data source, target messaging system, and PowerExchange Logger for Linux, UNIX, and Windows.
The PowerExchange CDC Publisher retrieves change data from the PowerExchange Logger log files. To read the change data, the CDC Publisher process creates a child extraction process that connects to the system that contains the PowerExchange Logger log files.
The PowerExchange Logger logs units of work (UOWs) in commit order. By default, the PowerExchange CDC Publisher maintains the order of DML change operations from the source when streaming change data to the target messaging system. However, if you configure CDC Publisher to stream data to multiple partitions in a single topic, CDC Publisher cannot ensure that change operations are written to the target messaging system in the same order that they were received from the source.
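The ordering caveat can be illustrated with a small simulation. This is a minimal sketch, not CDC Publisher code: the change records, the round-robin routing, and the function names are all hypothetical. It relies only on the general behavior of partitioned topics such as those in Apache Kafka, where order is preserved within a partition but not across partitions.

```python
from collections import defaultdict

# Hypothetical change records in source commit order: (sequence, key, operation).
changes = [
    (1, "order-17", "INSERT"),
    (2, "order-42", "INSERT"),
    (3, "order-17", "UPDATE"),
    (4, "order-42", "DELETE"),
]

def route(changes, partition_count):
    """Spread records round-robin over partitions (an assumed routing strategy)."""
    partitions = defaultdict(list)
    for i, record in enumerate(changes):
        partitions[i % partition_count].append(record)
    return partitions

def read_partition_by_partition(partitions):
    """Drain one partition after another, which is one valid way a consumer may see the data."""
    seen = []
    for pid in sorted(partitions):
        seen.extend(partitions[pid])
    return [seq for seq, _, _ in seen]

single = read_partition_by_partition(route(changes, 1))
multi = read_partition_by_partition(route(changes, 2))
print(single)  # [1, 2, 3, 4]
print(multi)   # [1, 3, 2, 4]
```

With one partition, the consumer sees the source order. With two partitions, each partition is still internally ordered, but the combined sequence that the consumer observes can differ from the commit order.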
You can run the PowerExchange Logger in continuous mode or batch mode. If you configure the PowerExchange Logger to run in batch mode and stop at the "end of log," the CDC Publisher streams the change data in bursts, as each batch Logger run makes changes available in the log files.
When establishing a change data stream, the PowerExchange CDC Publisher performs the following processing:
  1. Retrieves a list of extraction map names that match the schema name that you specified.
  2. If you defined filtering criteria for source tables or objects, selects the extraction maps that match your filters for use in extraction processing.
  3. Begins extracting change data.
  4. When the first change for an extraction map is received, generates an Avro schema for the source object.
  5. Formats the extracted source change records into messages based on the Avro schemas or custom pattern formats that you define.
  6. Connects to the target messaging system as a producer to send the formatted messages to target topics.
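The processing steps above can be sketched in a few lines. This is an illustration under stated assumptions, not the CDC Publisher implementation: the map names in schema.table form, the wildcard filter syntax, the all-string field typing, and the names `select_maps`, `SchemaCache`, and `format_message` are hypothetical. The schema dictionary follows the general Avro record layout, not the specific format that CDC Publisher generates.

```python
import json
from fnmatch import fnmatchcase

# Hypothetical extraction map names; real names come from the PowerExchange environment.
ALL_MAPS = ["sales.orders", "sales.customers", "hr.employees"]

def select_maps(all_maps, schema, table_patterns=None):
    """Steps 1-2: keep maps in the schema, then apply optional table filters."""
    in_schema = [m for m in all_maps if m.split(".", 1)[0] == schema]
    if not table_patterns:
        return in_schema
    return [m for m in in_schema
            if any(fnmatchcase(m.split(".", 1)[1], p) for p in table_patterns)]

class SchemaCache:
    """Step 4: build a schema when a map delivers its first change, then reuse it."""

    def __init__(self):
        self._schemas = {}
        self.generated = 0  # counts how many schemas were actually built

    def schema_for(self, map_name, record):
        if map_name not in self._schemas:
            self.generated += 1
            self._schemas[map_name] = {
                "type": "record",
                "name": map_name.replace(".", "_"),
                # Simplification: every column is typed as a nullable string.
                "fields": [{"name": col, "type": ["null", "string"]} for col in record],
            }
        return self._schemas[map_name]

def format_message(cache, map_name, record):
    """Step 5: format the change as a message; step 6 would send it to a topic."""
    schema = cache.schema_for(map_name, record)
    return json.dumps({"schema": schema["name"], "data": record}, sort_keys=True)

selected = select_maps(ALL_MAPS, "sales", ["ord*"])
cache = SchemaCache()
first = format_message(cache, selected[0], {"id": "1", "status": "NEW"})
second = format_message(cache, selected[0], {"id": "1", "status": "PAID"})
print(selected, cache.generated)
```

Although two changes are formatted for the same map, the schema is built only once, which mirrors the "first change received" trigger in step 4.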
After the change data is available in the target messaging system, consumer applications, such as Informatica Data Engineering Streaming, can consume the data for a variety of purposes. The consumer applications must have copies of the Avro schemas that PowerExchange CDC Publisher generated to decode the messages.
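The following sketch shows why a consumer needs the schema, assuming the messages use Avro binary encoding. In that encoding, a record is only the concatenated field values, with no field names or types in the payload. The sketch hand-implements two Avro primitives (zigzag varint longs and length-prefixed UTF-8 strings) rather than using an Avro library, and the `orders` schema and its fields are hypothetical.

```python
def encode_long(n):
    """Avro long: zigzag encoding, then a base-128 varint, low-order group first."""
    z = (n << 1) ^ (n >> 63)
    out = bytearray()
    while True:
        byte = z & 0x7F
        z >>= 7
        if z:
            out.append(byte | 0x80)  # more groups follow
        else:
            out.append(byte)
            return bytes(out)

def decode_long(buf, pos):
    shift = z = 0
    while True:
        byte = buf[pos]
        pos += 1
        z |= (byte & 0x7F) << shift
        if not byte & 0x80:
            break
        shift += 7
    return (z >> 1) ^ -(z & 1), pos

def encode_string(s):
    raw = s.encode("utf-8")
    return encode_long(len(raw)) + raw

def decode_string(buf, pos):
    length, pos = decode_long(buf, pos)
    return buf[pos:pos + length].decode("utf-8"), pos + length

ENCODERS = {"long": encode_long, "string": encode_string}
DECODERS = {"long": decode_long, "string": decode_string}

# Hypothetical schema; a real consumer would use the schema that CDC Publisher generated.
SCHEMA = {"type": "record", "name": "orders",
          "fields": [{"name": "id", "type": "long"},
                     {"name": "status", "type": "string"}]}

def encode_record(schema, record):
    return b"".join(ENCODERS[f["type"]](record[f["name"]]) for f in schema["fields"])

def decode_record(schema, payload):
    pos, record = 0, {}
    for field in schema["fields"]:
        record[field["name"]], pos = DECODERS[field["type"]](payload, pos)
    return record

payload = encode_record(SCHEMA, {"id": 42, "status": "NEW"})
print(payload)                         # b'T\x06NEW'
print(decode_record(SCHEMA, payload))  # {'id': 42, 'status': 'NEW'}
```

The five payload bytes carry no hint that the first value is a long named id. Only the schema tells the consumer how to split and label them.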
The PowerExchange CDC Publisher includes optional utilities for monitoring and administering a CDC Publisher process and for generating legible copies of the Avro schemas for consumer application use. You can run these utilities locally or from a remote system. To control access to the utility script files, you must use file system security. PowerExchange CDC Publisher does not provide security on the utility script files or administrative functions.
