User Guide

Publisher-Generated Messages for Data Delivery

PowerExchange CDC Publisher creates a message for each change record that it receives for a source table or object and sends the message to the target messaging system. The format of the message is based on an Avro schema that the PowerExchange CDC Publisher generates. To generate an Avro schema, the PowerExchange CDC Publisher uses the PowerExchange extraction map for the source object.
PowerExchange CDC Publisher does not generate messages for DDL operations or for UOW begin and commit records.

Message Content

A message contains extracted change data and metadata. The following table describes the fields that are included in a generated message:
| Field | Description |
| --- | --- |
| DTL__xxx | The PowerExchange-generated DTL__ columns that have been added to extraction maps by default or by the PowerExchange user. For more information about these columns, see the PowerExchange Navigator User Guide. |
| INFA_OP_TYPE | The change operation type (INSERT, UPDATE, or DELETE) that was extracted from the source. |
| INFA_TABLE_NAME | The source mapname_tablename from the extraction map name. This value identifies the source object for which change data was extracted. |
| INFA_SEQUENCE | A sequence number that the CDC Publisher assigns to the change record. |
| source_column_name | The after image of a source column to which a change operation was applied. |
| source_column_name_Present | An indicator of whether the column contains a value from a change operation. |
| source_column_name_BeforeImage | The before image value of an updated source column. |
| source_column_name_BeforeImage_Present | An indicator of whether the before image of an updated source column is present. |
For UPDATE operations, the Avro messages include both the before image and the after image. To suppress the Avro fields for before images, set the Formatter.avroIncludeBeforeImage property to false in the cdcPublisherAvro.cfg configuration file.
If you specify CAPTURE_IMAGE=AI in the PowerExchange Logger pwxccl.cfg file to capture after images only, no before image data will be available.
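As a rough illustration of how a consumer might read these fields, the following Python sketch works on a message that has already been decoded into a dictionary. The column names (CUST_ID, CUST_NAME, CUST_CITY), the sample values, the boolean type of the _Present indicators, and the helper name changed_columns are assumptions made for the example, not documented Publisher output.

```python
def changed_columns(message, column_names):
    """Return {column: (before, after)} for columns whose after image is present.

    Assumes (hypothetically) that the *_Present indicators decode to booleans.
    """
    changes = {}
    for col in column_names:
        if not message.get(f"{col}_Present", False):
            continue  # the change record carries no value for this column
        before = None
        if message.get(f"{col}_BeforeImage_Present", False):
            before = message.get(f"{col}_BeforeImage")
        changes[col] = (before, message.get(col))
    return changes


# Hypothetical decoded UPDATE message; all values are invented for illustration.
sample = {
    "INFA_OP_TYPE": "UPDATE",
    "INFA_TABLE_NAME": "map1_CUSTOMER",
    "INFA_SEQUENCE": "0001",
    "CUST_ID": "42", "CUST_ID_Present": True,
    "CUST_ID_BeforeImage": "42", "CUST_ID_BeforeImage_Present": True,
    "CUST_NAME": "Ada", "CUST_NAME_Present": True,
    "CUST_NAME_BeforeImage": "Adah", "CUST_NAME_BeforeImage_Present": True,
    "CUST_CITY": None, "CUST_CITY_Present": False,
}

result = changed_columns(sample, ["CUST_ID", "CUST_NAME", "CUST_CITY"])
```

If before images are not generated or not captured, the _BeforeImage fields would be absent, and the sketch falls back to None for the before value.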

Message Format and Encoding

PowerExchange CDC Publisher produces messages only in Avro format with the encoding type that you select.
You control the Avro encoding by setting the Formatter.avroEncodingType configuration property in the cdcPublisherAvro.cfg configuration file. You can specify an encoding type of JSON, binary, or none. The CDC Publisher uses the specified encoding type when serializing records in an Avro format. Set the encoding type to none, which indicates no explicit encoding, if you use a third-party encoding schema, such as the Confluent Schema Registry encoding schema.
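To give a feel for how the JSON and binary encoding types differ on the wire, the following Python sketch hand-encodes one small record both ways, following the general Avro specification rules for a long (zig-zag, then variable-length bytes) and a string (length as a long, then UTF-8 bytes). The two-field record and the field name SEQ are hypothetical; the sketch does not reproduce the schema or the exact bytes that the CDC Publisher emits.

```python
import json


def encode_long(n):
    """Avro binary encoding of a 64-bit long: zig-zag, then base-128 varint."""
    z = (n << 1) ^ (n >> 63)  # zig-zag maps small negatives to small positives
    out = bytearray()
    while z & ~0x7F:
        out.append((z & 0x7F) | 0x80)  # low 7 bits, continuation bit set
        z >>= 7
    out.append(z)
    return bytes(out)


def encode_string(s):
    """Avro binary encoding of a string: byte length as a long, then UTF-8 bytes."""
    data = s.encode("utf-8")
    return encode_long(len(data)) + data


# Hypothetical record with a string field followed by a long field.
record = {"INFA_OP_TYPE": "INSERT", "SEQ": 7}

# Binary encoding: field values concatenated in schema order, no field names.
binary_form = encode_string(record["INFA_OP_TYPE"]) + encode_long(record["SEQ"])

# JSON encoding: a JSON object keyed by field name.
json_form = json.dumps(record)
```

The binary form is compact but unreadable without the schema, while the JSON form repeats the field names in every message.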
The PowerExchange CDC Publisher generates an Avro schema when it receives change data for a source table or object. The PowerExchange CDC Publisher can generate Avro schemas in one of the following formats, depending on how you set the Formatter.avroSchemaFormat property in the cdcPublisherAvro.cfg configuration file:
  • Flat. Lists all Avro fields in one Avro record. A unique Avro schema is generated for each source object.
  • Nested. Organizes each type of information in a separate Avro record. A unique Avro schema is generated for each source object.
  • Generic. Generates an Avro schema that accommodates more than one source object. The source column names are included in each record, which allows the generic schema to be independent of any source object. All of the PowerExchange-generated DTL__xxx columns are included as metadata.
Additionally, you can "wrap" a flat, nested, or generic schema with an Avro schema that acts as a header that contains metadata followed by the underlying Avro schema for the source object. To do so, set the Formatter.avroWrapperSchemaFormat property to avroWrapperSchemaFormatV1 in the cdcPublisherAvro.cfg file. The wrapper schema format contains the following Avro fields:
  • Sequence number of the change record
  • Change operation type
  • Source mapname_tablename from the extraction map name
  • The "wrapped" Avro schema of the type specified in the Formatter.avroSchemaFormat property
All messages based on the wrapper schema have the same four-field format.
You can use a generic or wrapper schema to represent multiple source tables. The consumer of data in the target topics examines the metadata to determine which source table is represented and discover the source table structure. The consumer can then parse the underlying data and put it into the proper Avro format. Consider using a generic or wrapper schema when you want to send messages with change data from multiple source objects to a single Kafka topic. The topic is identified in the Connector.kafkaTopic property.
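The consumer-side routing described above can be sketched as follows. This is a minimal Python illustration that assumes each message has already been decoded into a dictionary and that the source object is exposed under a field named INFA_TABLE_NAME, as in the message field table. The handler functions, the table names, and the field name inside a real wrapper schema are hypothetical.

```python
def handle_customer(message):
    # Hypothetical handler for change records from one source table.
    return f"customer:{message['INFA_OP_TYPE']}"


def handle_orders(message):
    # Hypothetical handler for change records from another source table.
    return f"orders:{message['INFA_OP_TYPE']}"


# Map each source object (mapname_tablename) to the code that understands it.
HANDLERS = {
    "map1_CUSTOMER": handle_customer,
    "map1_ORDERS": handle_orders,
}


def route_message(message, handlers=HANDLERS):
    """Inspect the metadata and dispatch to the handler for that source table."""
    table = message["INFA_TABLE_NAME"]
    handler = handlers.get(table)
    if handler is None:
        raise KeyError(f"no handler registered for source object {table!r}")
    return handler(message)


routed = route_message({"INFA_TABLE_NAME": "map1_ORDERS", "INFA_OP_TYPE": "DELETE"})
```

Raising an error for an unknown source object is one possible design choice; a consumer could instead log and skip such messages.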
A flat or nested schema pertains to a specific source table. It defines the columns in the source table. Consider using a flat or nested schema when you want to send change data from a specific source object to the Kafka topic that is generated for that source object. In this case, set the Connector.kafkaTopic property to USE_TABLE_NAME.

Supported Avro Types

The Avro schemas that the CDC Publisher generates support all Avro primitive types except FLOAT and DOUBLE, which are represented as strings. Also, if you use Avro logical types for dates, decimal values, times, or timestamps, the CDC Publisher makes a best-effort attempt to process the logical types under the following conditions:
  • A Formatter.avroUseLogicaltype property is set to true in the cdcPublisherAvro.cfg configuration file, where type is DateType, DecimalType, TimeMillisType, TimeMicrosType, TimestampMillisType, or TimestampMicrosType.
  • The source field is defined in the extraction map with a compatible data type, scale, and precision.
The CDC Publisher does not convert a timestamp to a date type.
The Publisher-generated Avro schemas do not support complex types.
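Because FLOAT and DOUBLE values arrive as strings, a consumer typically needs to convert them back to numbers. The following Python sketch shows one way to do that. It assumes that the strings use ordinary decimal or scientific notation, which the source does not specify, and the helper name parse_numeric is hypothetical.

```python
from decimal import Decimal


def parse_numeric(text, exact=False):
    """Convert a string-encoded FLOAT or DOUBLE value to a Python number.

    Returns None for a missing value. With exact=True, returns a Decimal that
    keeps the digits exactly as written instead of a binary float.
    """
    if text is None:
        return None
    text = text.strip()
    return Decimal(text) if exact else float(text)


as_float = parse_numeric("1.5")
as_decimal = parse_numeric("2.50", exact=True)
```

Using Decimal avoids binary rounding when the downstream system needs the value exactly as it was written.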

Avro Schema Usage Considerations

Consumer applications that read data from the messaging system must have a copy of the Avro schema to decode the message. You can use the PwxCDCAdmin utility to generate Avro schemas in a legible format that consumer applications can use. For more information, see Reporting the Avro Format Definitions for Source Tables.
If you change the structure of a source table or object and update the extraction map, or if you change any of the Avro-related properties in the cdcPublisherAvro.cfg configuration file, the PowerExchange CDC Publisher does not automatically update any existing Avro schema. You can use the PwxCDCAdmin utility to clear the existing Avro schema from cache and then regenerate the Avro schema. For more information, see Handling Changes to Source Tables and Extraction Maps.
