User Guide

If Informatica Data Engineering Streaming Will Consume Data from a Target

PowerExchange CDC Publisher streams the change data that PowerExchange captures to target messaging systems, such as Apache Kafka, in near real time. Informatica Data Engineering Streaming can then consume the change data from the target message queue and use it for a variety of purposes. For example, Data Engineering Streaming can use the change data to generate near-real-time fraud detection alerts or to customize sales offers at the point of sale.
If the Data Engineering Streaming product will consume the change data that the PowerExchange CDC Publisher sends to a target messaging system, use the following PowerExchange CDC Publisher configuration guidelines:
  • Data Engineering Streaming cannot consume data from fields that have a binary data type. Configure the PowerExchange CDC Publisher to send data from binary fields as string data by setting the following properties in the cdcPublisherAvro.cfg configuration file:
    • Formatter.avroBinaryAsString=true. With this setting, binary data is represented as string data in the generated Avro messages.
    • Formatter.avroBinaryStringRepresentationType={base64|hexadecimal}. When Formatter.avroBinaryAsString=true, this property determines whether to use base64 or hexadecimal strings to represent binary data. Default is base64.
  • Data Engineering Streaming cannot consume JSON-encoded Avro messages. To use binary-encoded messages, specify Formatter.avroEncodingType=binary in the cdcPublisherAvro.cfg configuration file.
  • As a consumer application, Data Engineering Streaming must have copies of the Avro schemas for the source tables to properly interpret the change data in the messages. You can use the REPORT=FORMAT parameter of the PwxCDCAdmin utility to report the existing Avro schemas in a legible format for use by consumer applications. If no Avro schemas have been generated for the source tables, the utility attempts to create the Avro schemas based on the properties in the cdcPublisherAvro.cfg configuration file. For more information, see PwxCDCAdmin Utility - Command and Parameters.
  • If you try to import into Data Engineering Streaming an Avro schema that the PowerExchange CDC Publisher generated for a very large table, and the schema is larger than 65535 bytes, the Scala compiler issues a Java exception related to the scala.tools.asm package. This problem occurs because the Scala code does not handle literals greater than 65535 bytes in size. To work around this problem, you can configure the PowerExchange CDC Publisher to generate Avro schemas in a minimized format by specifying some or all of the following properties in the cdcPublisherAvro.cfg configuration file:
    • Formatter.avroSchemaPrintPretty={true|false}. Set this property to false to omit the spaces and line feeds that are intended to improve legibility in the generated Avro schemas. Default value is true, which causes the spaces and line feeds to be included.
    • Formatter.avroSchemaPrintDocFields={true|false}. Set this property to false to omit the "doc" fields from the generated Avro schemas. The doc fields include metadata such as the CDC and PowerExchange datatypes, precision, and scale. Default value is true, which causes this information to be included.
    • Formatter.avroSchemaPrintDefaultFields={true|false}. Set this property to false to omit the "default" fields from the generated Avro schemas. Default value is true, which causes the default fields to be included.
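When binary fields are sent as strings under the first guideline above, a downstream application that needs the original bytes must reverse the chosen representation. The following Python sketch illustrates one way to do that. The function name decode_binary_field is hypothetical and not part of any Informatica product. The sketch also assumes that hexadecimal strings contain plain hex digits with no prefix or separators, which this guide does not specify.

```python
import base64


def decode_binary_field(value: str, representation: str = "base64") -> bytes:
    """Convert a string-represented binary field back to raw bytes.

    `representation` mirrors the value chosen for
    Formatter.avroBinaryStringRepresentationType. This helper is a
    hypothetical consumer-side function, not a product API.
    """
    if representation == "base64":
        # validate=True rejects characters outside the base64 alphabet.
        return base64.b64decode(value, validate=True)
    if representation == "hexadecimal":
        # Assumption: plain hex digits, no "0x" prefix or separators.
        return bytes.fromhex(value)
    raise ValueError(f"unsupported representation: {representation!r}")


# Round trip with made-up sample bytes.
raw = b"\x00\x01\xfe\xff"
as_base64 = base64.b64encode(raw).decode("ascii")
as_hex = raw.hex()

restored_from_base64 = decode_binary_field(as_base64)
restored_from_hex = decode_binary_field(as_hex, "hexadecimal")
```

Both restored values equal the original bytes. Base64 strings are shorter than hexadecimal strings for the same data, while hexadecimal strings are easier to inspect by eye.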

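To see why the three minimization properties above help with the 65535-byte limit, it can be useful to measure a schema before and after the same kinds of content are removed. The following Python sketch approximates that effect on a schema that is already on hand. It is not the Publisher's implementation: the helper names, the sample schema, and its "doc" text are all made up for illustration. Note that removing "default" attributes changes how readers resolve missing fields, so letting the Publisher generate the minimized schema is the safer route.

```python
import json

SCALA_LITERAL_LIMIT = 65535  # byte limit cited in the guideline above


def minimize_schema(schema_text: str, keep_doc: bool = False,
                    keep_default: bool = False) -> str:
    """Return a compact copy of an Avro schema given as JSON text.

    Drops "doc" and/or "default" attributes and all whitespace that
    exists only for legibility. Hypothetical helper for illustration.
    """
    def strip(node):
        if isinstance(node, dict):
            return {
                key: strip(value)
                for key, value in node.items()
                if not (key == "doc" and not keep_doc)
                and not (key == "default" and not keep_default)
            }
        if isinstance(node, list):
            return [strip(item) for item in node]
        return node

    # separators=(",", ":") removes the spaces json.dumps adds by default.
    return json.dumps(strip(json.loads(schema_text)), separators=(",", ":"))


def fits_limit(schema_text: str) -> bool:
    """Check the UTF-8 size of the schema against the 65535-byte limit."""
    return len(schema_text.encode("utf-8")) <= SCALA_LITERAL_LIMIT


# Made-up sample schema, pretty-printed like a legible schema report.
sample = json.dumps({
    "type": "record",
    "name": "orders",
    "fields": [
        {"name": "order_id", "type": ["null", "long"],
         "default": None, "doc": "illustrative metadata text"},
    ],
}, indent=2)

compact = minimize_schema(sample)
saved_bytes = len(sample.encode("utf-8")) - len(compact.encode("utf-8"))
```

For a very wide table, the per-field "doc" text and indentation repeat for every column, so the savings grow with the number of columns.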