User Guide

Avro Formatter Configuration Properties

The sample cdcPublisherAvro.cfg file contains configuration properties that define the format of the generated Avro schema, the encoding type to use for serializing the Avro records to be included in messages, and several optional Formatter settings.

Property Descriptions

The following properties are in the sample cdcPublisherAvro.cfg file:
Formatter.formatterType
The type of data serialization formatter to use for messages. The only valid value is Avro.
Formatter.avroSchemaFormat
Required. The Avro schema format that the PowerExchange CDC Publisher uses to generate the Avro schema that will determine the structure of the message values. Valid values are:
  • avroFlatSchemaFormatV1. Structures messages by using a flat Avro schema format, which lists all Avro fields in one Avro record. A unique Avro schema is generated for each source object, which contains the Avro field definitions.
  • avroNestedSchemaFormatV1. Structures messages by using a nested Avro schema format, which provides a main Avro record that contains a separate nested record for each type of Avro field.
  • avroGenericSchemaFormatV1. Structures messages in a generic manner that accommodates any source object definition. All source columns are represented by an array. Each array entry contains column data and metadata. The source column names are included in each data record, allowing the generic schema to be independent of the source table.
No default value is provided.
You can "wrap" a flat, nested, or generic schema by setting the Formatter.avroWrapperSchemaFormat property to avroWrapperSchemaFormatV1. The schema then consists of four fields for each source object.
Use a generic or wrapper schema to allow a single Avro schema to represent multiple source tables. For more information about the schema formats, see Avro Schema Formats.
Formatter.avroEncodingType
Required. The Avro encoding type that the CDC Publisher Formatter uses to serialize the Avro records to be included in messages. Valid values are:
  • binary. Use binary encoding to serialize Avro records.
  • json. Use JSON to serialize Avro records.
  • none. Do not use any explicit encoding type. Specify this option only if you use Confluent Schema Registry in a Kafka target environment.
No default value is provided.
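The valid values listed above can be checked before the CDC Publisher starts. The following is a minimal sketch, not part of the product: the helper names are hypothetical, the name=value line syntax and the "#" comment marker are assumptions about the .cfg file, and the exact-match comparison assumes that values are case sensitive.

```python
# Hypothetical pre-flight check; not a PowerExchange CDC Publisher utility.
VALID_VALUES = {
    "Formatter.formatterType": {"Avro"},
    "Formatter.avroSchemaFormat": {
        "avroFlatSchemaFormatV1",
        "avroNestedSchemaFormatV1",
        "avroGenericSchemaFormatV1",
    },
    "Formatter.avroEncodingType": {"binary", "json", "none"},
}
# Only these two properties are marked "Required" in the descriptions above.
REQUIRED = {"Formatter.avroSchemaFormat", "Formatter.avroEncodingType"}


def parse_cfg(text):
    """Parse name=value lines (assumed syntax); skip blanks and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name, _, value = line.partition("=")
        settings[name.strip()] = value.strip()
    return settings


def check_formatter_settings(settings):
    """Return a list of problems found in the properties described above."""
    problems = []
    for name, allowed in VALID_VALUES.items():
        value = settings.get(name)
        if value is None:
            if name in REQUIRED:
                problems.append(f"{name} is not set")
        elif value not in allowed:
            problems.append(f"{name}={value} is not a documented value")
    return problems


sample = """
Formatter.formatterType=Avro
Formatter.avroSchemaFormat=avroFlatSchemaFormatV1
Formatter.avroEncodingType=binary
"""
print(check_formatter_settings(parse_cfg(sample)))  # prints []
```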
You can also include the following optional properties in the cdcPublisherAvro.cfg file:
Formatter.avroBinaryAsString
Controls whether change data with a binary datatype is represented as string data in Avro messages. Set this property to true if the data will be consumed by applications that do not support binary data, such as Informatica Big Data Streaming. The default value is false.
Formatter.avroIncludeBeforeImage
Controls whether the generated Avro schema and messages include a field for before-image data. Set this property to true to include this field. Set this property to false to not include this field.
If you include the before-image field, it is populated for UPDATE operations only, and only if you set the Extract.pwxUpdateImageOption property to enable the extraction of before-image data from the PowerExchange change stream. For DELETE and INSERT operations, the field is not populated with data.
The default value is true.
Formatter.avroBinaryStringRepresentationType
If you set the Formatter.avroBinaryAsString property to true or use a generic Avro format, indicates whether binary data is represented as a hexadecimal string or base64 string. Valid values are:
  • hexadecimal
  • base64
The default value is base64.
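To illustrate the difference between the two representations, the following sketch converts the same bytes to a hexadecimal string and to a base64 string. It uses only the Python standard library and is not the CDC Publisher's own code. The function name is hypothetical, and the letter case of the hexadecimal output that the Formatter produces is not stated here, so the lowercase result is simply Python's behavior.

```python
import base64


def binary_as_string(data: bytes, representation: str = "base64") -> str:
    """Render binary data as text, using one of the two documented options."""
    if representation == "hexadecimal":
        return data.hex()  # two hexadecimal digits per byte
    if representation == "base64":
        return base64.b64encode(data).decode("ascii")
    raise ValueError(f"unsupported representation: {representation}")


value = bytes([0x01, 0xAB, 0xFF])
print(binary_as_string(value, "hexadecimal"))  # prints 01abff
print(binary_as_string(value, "base64"))       # prints Aav/
```

A consumer application needs to know which representation is in effect so that it can decode the string back to the original bytes.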
Formatter.avroDisplaySchemaWithEscapedQuotes
If you use Confluent Schema Registry in a Kafka target environment and need to manually add an Avro schema to the registry as a single string that is delimited by double-quotation marks, set this property to true to use a backslash (\) as the escape character that precedes the double-quotation marks. Then run the PwxCDCAdmin utility with the REPORT=FORMAT parameter to generate a schema definition that includes the escape character before each delimiter, for example, \"schema_string\". You can then use the generated schema definition to add the schema to Confluent Schema Registry. The default value is false, which disables the use of escaped double-quotation marks in the generated schema.
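The escaping convention itself can be shown with a short sketch. This is not the output of the PwxCDCAdmin utility, only an illustration of what a schema looks like when it is carried as one double-quoted string with a backslash before each embedded double-quotation mark. The schema content and the function name are hypothetical.

```python
import json

# Hypothetical, minimal Avro schema text used only for illustration.
schema_text = '{"type":"record","name":"t1","fields":[]}'


def as_escaped_string(schema: str) -> str:
    """Wrap the schema in double quotes and escape the embedded quotes."""
    return json.dumps(schema)


escaped = as_escaped_string(schema_text)
print(escaped)
# prints "{\"type\":\"record\",\"name\":\"t1\",\"fields\":[]}"

# Decoding the escaped string recovers the original schema text.
assert json.loads(escaped) == schema_text
```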
Formatter.avroSchemaPrintDefaultFields
Controls whether Avro schemas include the "default" fields. If you need to reduce the schema size, you can set this property to false to exclude the default fields. The default value is true, which includes the default fields.
Formatter.avroSchemaPrintDocFields
Controls whether Avro schemas include the "doc" fields. The doc fields include metadata such as the CDC and PowerExchange datatypes, precision, and scale. If you need to reduce the schema size, you can set this property to false to exclude the doc fields. The default value is true, which includes the doc fields.
Formatter.avroSchemaPrintPretty
Controls whether Avro schemas include spaces and line feeds to improve legibility. If you need to reduce the schema size, you can set this property to false to exclude the spaces and line feeds. The default value is true, which includes the spaces and line feeds.
Formatter.avroWrapperSchemaFormat
Enables the use of an Avro "wrapper" schema format. The wrapper schema can be used to describe any source object. The wrapper, or parent, schema consists of four fields for each source object: the sequence number of the change record, source table name, change operation type, and the "wrapped" Avro child schema expressed as a large string. The consumer application can then parse the underlying data and put it in the proper Avro format for the source object. To use a wrapper schema format, set this property to avroWrapperSchemaFormatV1. No default value is provided. For more information, see Avro Wrapper Schema Format.
Formatter.avroUseLogicalDateType
Formatter.avroUseLogicalDecimalType
Formatter.avroUseLogicalTimeMillisType
Formatter.avroUseLogicalTimeMicrosType
Formatter.avroUseLogicalTimestampMillisType
Formatter.avroUseLogicalTimestampMicrosType
If you use Avro logical types for dates, decimal values, times, or timestamps and want the CDC Publisher to make a best-effort attempt to process these logical types, set the corresponding property to true. The properties in each of the following pairs are mutually exclusive, so specify one property or the other but not both:
  • Formatter.avroUseLogicalTimeMillisType and Formatter.avroUseLogicalTimeMicrosType
  • Formatter.avroUseLogicalTimestampMillisType and Formatter.avroUseLogicalTimestampMicrosType
The default value for each of these properties is false.
If you set a property to true, make sure that the source fields are defined in the extraction map with a compatible data type, scale, and precision.
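The mutual-exclusion rule for the Millis and Micros properties lends itself to a quick check. The following sketch is a hypothetical helper, not a product feature. It assumes that the settings have already been read into a dictionary of strings and that a property that is not present takes its documented default of false.

```python
# Pairs of properties that must not both be set to true.
EXCLUSIVE_PAIRS = [
    ("Formatter.avroUseLogicalTimeMillisType",
     "Formatter.avroUseLogicalTimeMicrosType"),
    ("Formatter.avroUseLogicalTimestampMillisType",
     "Formatter.avroUseLogicalTimestampMicrosType"),
]


def conflicting_pairs(settings):
    """Return every exclusive pair in which both properties are true."""
    def enabled(name):
        # A missing property falls back to the documented default, false.
        return settings.get(name, "false").strip().lower() == "true"

    return [pair for pair in EXCLUSIVE_PAIRS
            if enabled(pair[0]) and enabled(pair[1])]


settings = {
    "Formatter.avroUseLogicalDateType": "true",
    "Formatter.avroUseLogicalTimeMillisType": "true",
    "Formatter.avroUseLogicalTimeMicrosType": "true",
    "Formatter.avroUseLogicalTimestampMicrosType": "true",
}
for first, second in conflicting_pairs(settings):
    print(f"Set only one of {first} and {second}")
```

With the sample settings, only the time pair is reported because the timestamp Millis property is left at its default.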
Formatter.formatterAddTimestampColumn
Indicates whether the PowerExchange CDC Publisher adds a timestamp column to the generated Avro schema and in formatted output messages to represent the time at which the Formatter processed the incoming change records. Valid values are:
  • false. Do not add the timestamp column.
  • true. Add the timestamp column.
Default is false.
If you set this property to true, you can optionally specify the column name, timestamp format, and time zone in the following properties: Formatter.formatterAddedTimestampColumnName, Formatter.formatterAddedTimestampColumnFormat, and Formatter.formatterAddedTimestampColumnTimezone.
Formatter.formatterAddedTimestampColumnFormat
If Formatter.formatterAddTimestampColumn is set to true, you can use this property to indicate the format of the timestamp values that can be included in the added timestamp metadata column. You can enter any character string that the Java class SimpleDateFormat supports for formatting dates and times. For more information, see https://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html. Default is yyyy/MM/dd HH:mm:ss.SSS.
Formatter.formatterAddedTimestampColumnName
If Formatter.formatterAddTimestampColumn is set to true, you can use this property to specify the name of the added timestamp metadata column. This column will appear in the generated Avro schema and messages. Enter any alphanumeric character string. Default is INFA_TIME_CREATED.
Formatter.formatterAddedTimestampColumnTimezone
If Formatter.formatterAddTimestampColumn is set to true, you can use this property to control the time zone in which the timestamp value in the added timestamp metadata column is reported. Valid values are:
  • local. The local time zone where the CDC Publisher runs.
  • UTC. Coordinated Universal Time.
Default is local.
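To show what a value in the added timestamp column might look like, the following sketch renders one instant in the default pattern yyyy/MM/dd HH:mm:ss.SSS for both time zone options. It is an approximation written with Python's datetime module rather than the Java SimpleDateFormat class that the CDC Publisher relies on, and the function name is hypothetical.

```python
from datetime import datetime, timezone


def format_added_timestamp(moment: datetime, tz: str = "local") -> str:
    """Format a timezone-aware datetime as yyyy/MM/dd HH:mm:ss.SSS."""
    if tz == "UTC":
        shifted = moment.astimezone(timezone.utc)
    elif tz == "local":
        shifted = moment.astimezone()  # time zone of the machine running this
    else:
        raise ValueError(f"unsupported time zone option: {tz}")
    millis = shifted.microsecond // 1000  # keep millisecond precision only
    return shifted.strftime("%Y/%m/%d %H:%M:%S") + f".{millis:03d}"


moment = datetime(2024, 3, 5, 14, 7, 9, 123456, tzinfo=timezone.utc)
print(format_added_timestamp(moment, "UTC"))    # prints 2024/03/05 14:07:09.123
print(format_added_timestamp(moment, "local"))  # depends on the local time zone
```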
Formatter.generateCommitDML
Indicates whether the Formatter generates messages for transaction commit operations. Also indicates whether the Formatter generates a commit message for each source table that was updated by the committed transaction or generates one commit message for all of the updated tables by using the schema of the last updated table. Valid values are:
  • none. Do not generate messages for commit operations.
  • LAST_TABLE. Generate a single commit message for all source tables that the transaction updated. The Formatter generates the commit message by using the Avro schema of the last source table that was updated by the transaction.
  • ALL_TABLES. Generate a commit message for each source table that was updated by the transaction. Consider using this option if you configured CDC Publisher to generate one topic per source table.
Default is none.
If you enable the generation of commit messages, you can optionally set the Connector.kafkaCommitDmlTopic and Connector.kafkaCommitDmlTopicFiltering properties in the cdcPublisherKafka.cfg file.
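The effect of the three Formatter.generateCommitDML values can be sketched as a function that, given the tables a transaction updated, returns the tables whose schemas would be used for commit messages. This only models the behavior described above and is not the Formatter's implementation. The function name and table names are hypothetical, and the order of the ALL_TABLES result is an assumption.

```python
def commit_message_tables(updated_tables, option="none"):
    """Return the tables for which a commit message would be generated.

    updated_tables lists the source tables in the order in which the
    committed transaction updated them; a table can appear more than once.
    """
    if option == "none":
        return []
    if option == "LAST_TABLE":
        return updated_tables[-1:]  # one message, schema of the last table
    if option == "ALL_TABLES":
        # One message per distinct table; first-seen order is an assumption.
        return list(dict.fromkeys(updated_tables))
    raise ValueError(f"unsupported option: {option}")


updates = ["ORDERS", "ITEMS", "ORDERS", "CUSTOMERS"]
print(commit_message_tables(updates, "none"))        # prints []
print(commit_message_tables(updates, "LAST_TABLE"))  # prints ['CUSTOMERS']
print(commit_message_tables(updates, "ALL_TABLES"))
# prints ['ORDERS', 'ITEMS', 'CUSTOMERS']
```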
