The sample cdcPublisherKafka.cfg file contains configuration properties related to the Apache Kafka or MapR Streams target.
Property Descriptions
The following properties are in the sample cdcPublisherKafka.cfg configuration file:
Connector.kafkaProducerPropertiesFile
Required. The path and file name of the Kafka producer.properties file that PowerExchange CDC Publisher uses to communicate with Kafka. This file is typically in the /Kafka_installation/config directory.
No default value is provided.
Connector.kafkaTopic
Required. The Kafka topic or topics to which PowerExchange CDC Publisher sends messages that contain extracted change data. Enter a specific topic name if you want to send all messages to a single topic, or enter USE_TABLE_NAME to direct the target messaging system to use a separate topic for each source table. A generated source-specific Kafka topic uses the mapname_tablename portion of the full extraction map name as the topic name. The full extraction map name has the format schema.mapname_tablename. For MapR targets, the topic name includes the stream name in the format stream_path_name:mapname_tablename.
By default, the target messaging system automatically generates topics that do not exist the first time messages are sent to them. You can disable the automatic generation of topics by setting the auto.create.topics.enable parameter to false in Kafka or by setting the autocreate parameter to false in MapR. If you do so, you must manually create the topics before CDC Publisher starts publishing messages.
No default value is provided.
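The naming rule for generated topics can be sketched in a few lines of Python. This is an illustration only, not CDC Publisher code: the function name derive_topic_name, the sample map name, and the sample stream path are hypothetical, and the sketch assumes that the schema part of the extraction map name contains no period.

```python
from typing import Optional


def derive_topic_name(full_map_name: str, stream_path: Optional[str] = None) -> str:
    """Return the topic name generated for one source table.

    full_map_name uses the documented format schema.mapname_tablename.
    stream_path applies to MapR targets only (hypothetical sample value below).
    """
    # Assumption: the schema is everything before the first period.
    schema, sep, map_and_table = full_map_name.partition(".")
    if not sep or not schema or not map_and_table:
        raise ValueError(f"expected schema.mapname_tablename, got {full_map_name!r}")
    if stream_path is not None:
        # MapR format: stream_path_name:mapname_tablename
        return f"{stream_path}:{map_and_table}"
    return map_and_table


print(derive_topic_name("d8oracle.ordmap_ORDERS"))
print(derive_topic_name("d8oracle.ordmap_ORDERS", "/streams/cdc"))
```

If automatic topic generation is disabled, a helper like this could be used to list the topic names that must be created manually before publishing starts.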
Connector.queueType
Required. The type of target messaging queue to which PowerExchange CDC Publisher streams messages. Valid values are:
kafka for an Apache Kafka target
maprkafka for a MapR Streams target that uses the Kafka API for producers
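As a pre-flight check, the three required properties can be verified before CDC Publisher is started. The following Python sketch is not part of the product. It assumes that the configuration file holds one name=value pair per line, that lines beginning with # are comments, and that values are case-sensitive; the helper names and the sample path are hypothetical.

```python
REQUIRED = (
    "Connector.kafkaProducerPropertiesFile",
    "Connector.kafkaTopic",
    "Connector.queueType",
)
VALID_QUEUE_TYPES = {"kafka", "maprkafka"}


def parse_cfg(text):
    """Parse name=value lines into a dict (assumed file syntax)."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):  # assumed comment marker
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def find_problems(props):
    """Return a list of human-readable problems; empty if none are found."""
    problems = [
        f"missing required property: {name}"
        for name in REQUIRED
        if not props.get(name)
    ]
    queue_type = props.get("Connector.queueType")
    if queue_type and queue_type not in VALID_QUEUE_TYPES:
        problems.append(f"invalid Connector.queueType: {queue_type}")
    return problems


# Hypothetical sample content for illustration.
SAMPLE = """\
# required connector properties
Connector.queueType=kafka
Connector.kafkaTopic=USE_TABLE_NAME
Connector.kafkaProducerPropertiesFile=/opt/kafka/config/producer.properties
"""

print(find_problems(parse_cfg(SAMPLE)))
```

The check covers only the presence of the properties and the documented queueType values. It does not verify that the producer.properties file exists or that the topic can be reached.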
The following additional properties can also be included in the cdcPublisherKafka.cfg file at your discretion:
Connector.checkpointMessageFrequency
Controls the frequency of writing checkpoints to the checkpoint file. Specify the number of target messages that must be written before a checkpoint is taken. The Connector.checkpointMessageFrequency setting works in tandem with the Connector.checkpointTimeFrequency setting.
The checkpoint value is used on CDC Publisher restart to determine where to start reading from the change stream. Frequent checkpoints reduce the number of duplicate messages that might be sent when you restart CDC Publisher. Less frequent checkpoints reduce overhead while increasing the number of duplicates that might be sent on restart.
The default is 0, which means that no checkpoint is taken.
Connector.checkpointPublisherId
If you set the Connector.checkpointsInTarget property to true, this property is required to specify the logical name of the CDC Publisher instance to use for writing checkpoint information to Kafka message topic headers. When PowerExchange CDC Publisher starts, the specified value is compared to the checkpointPublisherId value in the Kafka headers.
Enter a name that uniquely identifies the CDC Publisher instance. If the name is not unique, the checkpoint information might be shared across topics, which can cause data corruption on the target.
Valid values are any string that uniquely identifies the CDC Publisher instance.
No default value is provided.
Connector.checkpointTimeFrequency
Controls the frequency at which checkpoints are written to a checkpoint file. Specifies the number of seconds that must elapse before a checkpoint is written to the checkpoint file. The Connector.checkpointTimeFrequency setting works in tandem with the Connector.checkpointMessageFrequency setting.
The checkpoint value is used on CDC Publisher restart to determine where to start reading from the change stream. Frequent checkpoints reduce the number of duplicate messages that might be sent when you restart CDC Publisher. Less frequent checkpoints reduce overhead while increasing the number of duplicates that might be sent on restart.
The default is 0, which means that no checkpoint is taken.
Connector.checkpointsInTarget
Controls whether PowerExchange CDC Publisher stores checkpoints for CDC restart processing in Kafka headers on the target or in a local checkpoint file. Kafka version 0.11.0.2 or later is required. Valid values are:
true. Store checkpoint information in Kafka. Do not use this option if you have a Kafka version that does not support headers.
false. Store checkpoint information in a checkpoint file instead of Kafka.
Default is false.
With this property, you can also specify the following related connector configuration properties:
Connector.checkpointMessageFrequency. Optional.
Connector.checkpointPublisherId. Required.
Connector.checkpointTimeFrequency. Optional.
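The dependencies among the checkpoint properties can be checked mechanically. The Python sketch below is illustrative only: the function names are hypothetical, property values are assumed to be the lowercase strings true and false, and the Kafka version is assumed to be a purely numeric dotted string such as 0.11.0.2.

```python
MIN_HEADER_VERSION = (0, 11, 0, 2)  # minimum Kafka version stated for this property


def version_tuple(version):
    """Convert a dotted numeric version string to a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def check_checkpoint_settings(props, kafka_version):
    """Return a list of problems with the checkpoint-related settings."""
    problems = []
    in_target = props.get("Connector.checkpointsInTarget", "false") == "true"
    if not in_target:
        # Checkpoints go to a local checkpoint file; nothing more to check here.
        return problems
    if not props.get("Connector.checkpointPublisherId"):
        problems.append(
            "Connector.checkpointPublisherId is required when "
            "Connector.checkpointsInTarget=true"
        )
    if version_tuple(kafka_version) < MIN_HEADER_VERSION:
        problems.append(
            f"Kafka {kafka_version} is older than 0.11.0.2; "
            "checkpoints in Kafka headers are not supported"
        )
    return problems


print(check_checkpoint_settings({"Connector.checkpointsInTarget": "true"}, "0.10.2.1"))
```

Tuple comparison treats 0.11.0 as older than 0.11.0.2, which matches the stated minimum version.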
Connector.kafkaCheckpointFileDirectory
A name for the CDC Publisher instancex subdirectory to which the CDC Publisher writes checkpoint files. The default value is "checkpoint". Use this property to override the default subdirectory name.
Connector.kafkaCommitDmlTopic
Indicates whether to send all commit messages to a single topic that you specify or to send commit messages to the topic or topics that the Connector.kafkaTopic property identifies. Valid values are:
topic_name. The name of the Kafka topic to which the CDC Publisher sends all commit messages. Use this option to send commit messages to a topic that is different from the topic that the Connector.kafkaTopic property identifies.
default. Send commit messages to the topic or topics that the Connector.kafkaTopic property identifies. If the Connector.kafkaTopic property specifies USE_TABLE_NAME, the commit messages, along with the data messages, are sent to the source-specific topics that are generated for each source table involved in the transaction.
Default is default.
To use this property, you must enable the generation of commit messages in the Formatter.generate.CommitDML property in the cdcPublisherAvro.cfg file.
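The routing of commit messages can be sketched as follows. This Python example is a simplified model and not CDC Publisher code: the function name commit_topics and the topic names are hypothetical, and the sketch assumes that each topic receives at most one commit message per transaction, as with the MAX_ONE_PER_TOPIC setting of the Connector.kafkaCommitDmlTopicFiltering property.

```python
def commit_topics(commit_dml_topic, data_topics):
    """Return the topics that receive the commit message for one transaction.

    commit_dml_topic: value of Connector.kafkaCommitDmlTopic.
    data_topics: topics that received DML messages in the transaction,
                 in send order (may contain repeats).
    """
    if commit_dml_topic == "default":
        # Follow the data messages; keep first-seen order, drop repeats.
        return list(dict.fromkeys(data_topics))
    # A specific topic name: all commit messages go to that single topic.
    return [commit_dml_topic]


sent_to = ["ordmap_ORDERS", "ordmap_ITEMS", "ordmap_ORDERS"]
print(commit_topics("default", sent_to))
print(commit_topics("cdc_commits", sent_to))
```

With a dedicated commit topic, a consumer can track transaction boundaries in one place, but it must correlate those commits with data messages that arrive on other topics.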
Connector.kafkaCommitDmlTopicFiltering
Indicates whether to filter commit messages before they are sent to the topic or topics that receive messages for DML operations in a transaction. Valid values are:
none. Do not filter the commit messages to be sent to topics. Send all of the commit messages to each topic that receives messages for the DML operations in the transaction.
MAX_ONE_PER_TOPIC. Send only one commit message to each topic that receives messages for the DML operations in the transaction.
Default is MAX_ONE_PER_TOPIC.
To use this property, you must enable the generation of commit messages in the Formatter.generate.CommitDML property in the cdcPublisherAvro.cfg file.
Connector.kafkaConsumerPropertiesFile
Optional. The path and file name of the Kafka consumer.properties file that PowerExchange CDC Publisher uses to communicate with Kafka.
If you do not specify a value and PowerExchange CDC Publisher needs to establish a connection as a Kafka consumer, it uses the producer properties, with values appended or overridden as needed to meet the PowerExchange CDC Publisher consumer requirements.
This file is typically in the /Kafka_installation/config directory.
No default value is provided.
Connector.kafkaFileCheckpointFileName
Controls the name of the file to which the CDC Publisher writes checkpoints. This file is located in the subdirectory that the Connector.kafkaCheckpointFileDirectory property specifies. Enter a specific file name or DEFAULT. If you enter DEFAULT, the CDC Publisher uses the default file name of "checkpoint". The default property value is DEFAULT.
Connector.kafkaMaprStreamName
Required for MapR Streams targets. The path and name of the existing MapR stream that contains the topic or topics to which the CDC Publisher will publish messages.
In MapR, the stream name is combined with the topic name in the format path_and_name_of_stream:topic_name to identify a topic. The CDC Publisher combines the stream name that you specify in this property with the topic name that you specify in the Connector.kafkaTopic property.
Connector.kafkaMessageKey
Identifies the key value to include in the messages that the CDC Publisher producer delivers to the target messaging system. You can specify a key value or use the source table name as the key value. Valid values are:
string. Specify a character string to use as the message key in all messages. The messages will be sent to the same partitions in the target topics.
USE_TABLE_NAME. Use the source table name as the message key in all messages. Messages for a specific source table will be sent to the same partitions in the target topics.
If you omit this property, no message key is sent to the target messaging system.
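The effect of the message key on partition assignment can be illustrated with a small Python sketch. It is not the partitioning code of Kafka or MapR: key-based partitioners generally hash the key and take the result modulo the partition count, and the sketch uses CRC32 as a stand-in hash. The function names and sample values are hypothetical.

```python
import zlib
from typing import Optional


def resolve_message_key(setting: Optional[str], table_name: str) -> Optional[str]:
    """Return the key sent with a message, or None if the property is omitted."""
    if setting is None:
        return None  # no message key is sent
    if setting == "USE_TABLE_NAME":
        return table_name
    return setting  # a fixed string used for all messages


def partition_for(key: str, partition_count: int) -> int:
    """Map a key to a partition with a stable hash (stand-in, not Kafka's hash)."""
    return zlib.crc32(key.encode("utf-8")) % partition_count


for table in ("ORDERS", "ITEMS", "ORDERS"):
    key = resolve_message_key("USE_TABLE_NAME", table)
    print(table, "->", partition_for(key, 6))
```

Because the hash is deterministic, both ORDERS messages map to the same partition. A fixed string key sends every message to the same partition of a given topic, which limits parallelism on the consumer side.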
to reduce the possibility that a message is sent to the target messaging system more than once or in the wrong order. Valid values are:
true. Enables guaranteed delivery. CDC Publisher overrides some CDC Publisher connector properties to force a single stream of messages to be synchronously delivered one at a time to a single partition in any target topic. This delivery mode avoids message loss and duplication. CDC Publisher writes a checkpoint after each message is acknowledged as successfully delivered to the target.
false. Disables guaranteed delivery. If the target messaging system terminates while changes are in flight, CDC Publisher might deliver duplicate messages to the target topics after the change stream is restarted.
Default is true.
Connector.kafkaProducerPartitionID
The numeric partition ID that the CDC Publisher instance assigns to each message that is sent to the target messaging system. CDC Publisher uses this partition ID across all messages and target topics.
Valid values are -1 through 32767.
The value of -1 causes the target messaging system to select the partition ID. If you configure a message key in the Connector.kafkaMessageKey property, the target messaging system can use the message key to assign messages to partitions.
If you enter a valid value greater than -1, CDC Publisher writes data to the partition that has the specified partition ID. Ensure that a partition with the specified ID exists in all of the topics to which the CDC Publisher will write messages.
Default is -1.
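The range check and the requirement that the partition exist in every target topic can be expressed as a short pre-flight check. The Python sketch below is illustrative: the function names and topic names are hypothetical, and the partition counts are supplied by hand rather than queried from a broker. It relies on Kafka partition IDs being numbered from 0.

```python
def validate_partition_id(value):
    """Parse and range-check a Connector.kafkaProducerPartitionID value."""
    partition_id = int(value)
    if not -1 <= partition_id <= 32767:
        raise ValueError(f"partition ID {partition_id} is outside -1 through 32767")
    return partition_id


def topics_missing_partition(partition_id, partition_counts):
    """Return the topics that do not contain the requested partition.

    partition_counts maps each topic name to its number of partitions.
    """
    if partition_id == -1:
        return []  # the target messaging system selects the partition
    return sorted(
        topic
        for topic, count in partition_counts.items()
        if partition_id >= count  # valid IDs are 0 through count - 1
    )


counts = {"ordmap_ORDERS": 4, "ordmap_ITEMS": 2}  # hypothetical topics
print(topics_missing_partition(validate_partition_id("3"), counts))
```

In this sample, partition 3 exists in the four-partition topic but not in the two-partition topic, so the second topic is reported.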
Connector.kafkaTopicAllowSpecialCharacters
Controls whether the period (.) and dash (-) special characters can be used in Kafka target topic names. Also determines whether topic names can begin with a number.
Valid values are:
true. Allows the period (.) and dash (-) characters to be used in topic names. Also allows a number to be the first character in topic names.
false. Causes topic names that include a period or dash character to be either automatically adjusted or rejected. If you set Connector.kafkaTopic=USE_TABLE_NAME to derive topic names from source table names, any period or dash in a table name is translated to an underscore (_) character. If you set Connector.kafkaTopic to a specific topic name that includes a period or dash, the topic name is rejected. Also, with this setting, topic names cannot begin with a number.
Default is false.
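The adjust-or-reject behavior can be summarized in a short Python sketch. It is a model of the rules described for this property, not product code. The function name resolve_topic_name and the sample names are hypothetical, and the sketch assumes that a name beginning with a number is rejected when the property is false, because the text states only that such names are not allowed.

```python
def resolve_topic_name(name, derived_from_table, allow_special):
    """Apply the Connector.kafkaTopicAllowSpecialCharacters rules to a topic name.

    derived_from_table: True if Connector.kafkaTopic=USE_TABLE_NAME.
    allow_special: the value of Connector.kafkaTopicAllowSpecialCharacters.
    """
    if allow_special:
        return name  # periods, dashes, and a leading number are allowed
    if derived_from_table:
        # Derived names are adjusted: period and dash become underscore.
        name = name.replace(".", "_").replace("-", "_")
    elif "." in name or "-" in name:
        # Explicitly configured names are rejected instead of adjusted.
        raise ValueError(f"topic name {name!r} contains a period or dash")
    if name[:1].isdigit():
        # Assumption: a leading number causes rejection.
        raise ValueError(f"topic name {name!r} begins with a number")
    return name


print(resolve_topic_name("ordmap_ORDER-ITEMS", derived_from_table=True, allow_special=False))
print(resolve_topic_name("cdc.orders-v1", derived_from_table=False, allow_special=True))
```

The first call prints the adjusted name ordmap_ORDER_ITEMS. The second call returns the name unchanged because special characters are allowed.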
Connector.logConnectorStatsOnExit
Controls whether PowerExchange CDC Publisher writes runtime statistics to the message log when the Publisher process shuts down. The statistics are printed in message CDCPUB_15034. The statistics are the same as those produced by the PWXCDCINFO STATS=TOPIC command. Valid values are:
true. Prints runtime statistics at shutdown.
false. Does not print runtime statistics at shutdown.
Default is false.
Connector.restartCheckpointSource
Overrides the source from which checkpoint information is obtained at restart. Use this property if a checkpoint source becomes corrupted and you need to specify where checkpoints are obtained. Valid values are:
all. Checkpoints can be collected from the target and the backup checkpoint file.
target. Acquire the checkpoint from the target.
file. Acquire the checkpoint from the backup file.
Default is all.
Connector.sendMaintainOrder
Controls whether the CDC Publisher maintains the order in which change operations were retrieved from the source when sending messages to the target messaging system. Valid values are:
true. Sends messages synchronously to the same partition in each target topic in the order that the change operations were retrieved from the source.
false. Sends messages asynchronously to one or more partitions in target topics as soon as possible, without regard for the order in which the change operations were retrieved from the source.