PowerExchange CDC Publisher generates a checkpoint file after it sends the first change operation. As data streaming progresses, the CDC Publisher saves information about the last change operation processed to the checkpoint file. This checkpoint information is used to resume CDC Publisher apply processing after the CDC Publisher is restarted.
The CDC Publisher uses only one checkpoint file for each instance. By default, the file is named "checkpoint" and is created in the "checkpoint" subdirectory of an instance. You can change the file name and directory by specifying the Connector.kafkaFileCheckpointFileName and Connector.kafkaCheckpointFileDirectory properties in the cdcPublisherKafka.cfg configuration file.
The checkpoint file contains checkpoint information only for the last change operation processed. The checkpoint format is specific to the CDC Publisher and should not be edited. When a checkpoint is written depends on whether you set the Connector.kafkaProducerGuaranteeDelivery property to false in the cdcPublisherKafka.cfg file or accept the default value of true. With the default value of true, the CDC Publisher uses guaranteed delivery to write a checkpoint after each change operation is acknowledged as successfully received by the target messaging system. This delivery mode avoids message loss and duplication but slows apply processing. If you set this property to false and the target messaging system terminates while changes are in flight, the CDC Publisher will not skip any change operations but might send duplicate messages to the target messaging system after the change stream is restarted.
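The difference between the two delivery modes can be illustrated with a small toy model. The sketch below is not the CDC Publisher implementation: the ToyTarget class, the run_publisher and simulate functions, and the checkpoint_every interval are all hypothetical names introduced for illustration. In particular, the model assumes that with guaranteed delivery turned off the checkpoint simply lags behind the messages already sent, and it simulates a failure only between change operations, not between a send and its acknowledgment.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTarget:
    """Hypothetical stand-in for the target messaging system."""
    received: list = field(default_factory=list)

    def send(self, op):
        self.received.append(op)
        return True  # acknowledgment that the message was received


def run_publisher(changes, target, start, guaranteed, crash_after,
                  checkpoint_every=3):
    """Send changes[start:] and return the checkpoint left behind.

    The checkpoint is the index of the next change operation to send
    after a restart. crash_after is the number of sends after which the
    run ends abnormally, or None for a run that completes.
    """
    checkpoint = start
    sent = 0
    for i in range(start, len(changes)):
        if crash_after is not None and sent == crash_after:
            return checkpoint  # abnormal end: only the saved checkpoint survives
        acked = target.send(changes[i])
        sent += 1
        if guaranteed:
            # Guaranteed delivery: checkpoint after every acknowledged operation.
            if acked:
                checkpoint = i + 1
        elif (i + 1) % checkpoint_every == 0:
            # Assumed behavior for the toy model: the checkpoint lags the sends.
            checkpoint = i + 1
    return len(changes)  # clean finish: everything is checkpointed


def simulate(guaranteed, crash_after=5):
    """Run once until the simulated failure, then restart from the checkpoint."""
    changes = [f"op{i}" for i in range(7)]
    target = ToyTarget()
    checkpoint = run_publisher(changes, target, 0, guaranteed, crash_after)
    run_publisher(changes, target, checkpoint, guaranteed, None)
    return changes, target.received


if __name__ == "__main__":
    for mode in (True, False):
        changes, received = simulate(mode)
        print(f"guaranteed={mode}: received {received}")
```

In this model, the run with guaranteed delivery delivers each of the seven operations exactly once. The run without it restarts from an older checkpoint and resends op3 and op4, so no operation is skipped but two are duplicated.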
The following considerations pertain to using checkpoints and checkpoint files:
If connectivity to the target messaging system is lost or the CDC Publisher terminates, by default the CDC Publisher process restarts from the checkpoint position that is recorded in the checkpoint file. In this situation, some messages might be duplicated on the target messaging system. To guarantee that messages are not duplicated, ensure that the Connector.kafkaProducerGuaranteeDelivery property is set to true.
If the PowerExchange CDC Publisher process ends abnormally, the checkpoint value in the checkpoint file might not be accurate in the following situations:
The existing checkpoint value does not reflect the last change operation because the change stream terminated after the CDC Publisher sent a message to the target messaging system and before the target acknowledged the message as received. In this case, the CDC Publisher still restarts from the existing checkpoint value. Some messages that were previously sent to the target messaging system might be resent.
The checkpoint file is corrupted. This situation can occur if an attempt to write a checkpoint value to the checkpoint file failed or did not complete. In this case, delete the checkpoint file. Then configure a restart point by setting the Extract.restart1 and Extract.restart2 properties in the cdcPublisherPowerExchange.cfg file. When you restart the CDC Publisher, use the RESTART=FROM_CONFIG parameter. If you do not configure a specific restart point, the CDC Publisher restarts from the oldest point in the log files, as if RESTART=FROM_BEGINNING is specified.
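The recovery steps for a corrupted checkpoint file can be sketched as a small helper. This is only an illustration of the decision described above, not a tool shipped with the product: the plan_recovery function is hypothetical, the caller must supply the real checkpoint path and the values read from Extract.restart1 and Extract.restart2, and treating a half-configured restart point as "not configured" is an assumption of this sketch.

```python
from pathlib import Path
from typing import Optional


def plan_recovery(checkpoint_path: Path,
                  restart1: Optional[str],
                  restart2: Optional[str]) -> str:
    """Delete a corrupted checkpoint file and report the restart behavior.

    restart1 and restart2 stand for the values of Extract.restart1 and
    Extract.restart2, or None if they are not set.
    """
    # Step 1: remove the corrupted checkpoint file (no error if already gone).
    checkpoint_path.unlink(missing_ok=True)  # requires Python 3.8 or later

    # Step 2: with a configured restart point, restart with RESTART=FROM_CONFIG.
    if restart1 and restart2:
        return "FROM_CONFIG"

    # Otherwise processing restarts from the oldest point in the log files,
    # as if RESTART=FROM_BEGINNING were specified.
    return "FROM_BEGINNING"
```

A call such as plan_recovery(Path("checkpoint") / "checkpoint", None, None) returns "FROM_BEGINNING", which is a reminder that skipping the restart-point configuration replays the change stream from the oldest available log position.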