Using Heartbeat Tables to Determine the Latency of Change Data Flow

Using Heartbeat Tables to Determine the Latency of Change Data Flow

Configuring Heartbeat Tables to Monitor Data Latency for CDC Sessions

Configuring Heartbeat Tables to Monitor Data Latency for CDC Sessions

Complete the following steps to configure source and target heartbeat tables and include them in a PowerCenter CDC session to monitor change data latency.
This procedure uses Oracle source and target heartbeat tables that are named PM_SRC_HEARTBEAT and PM_TGT_HEARTBEAT, as an example. If you prefer, you can create the heartbeat tables in other types of databases that PowerExchange and PWXPC support and use different table and column names. However, if you use a different database type, you might need to use different column datatypes.
  1. In the CDC source database, create the source heartbeat table.
    Use a table name that implies the table function, such as PM_SRC_HEARTBEAT. The table contains one column named SRC_TSTAMP, which holds the system time stamp that marks when an update occurs.
    For example, use the following DDL to create an Oracle table that has a column with the datatype of TIMESTAMP:
    CREATE TABLE PM_SRC_HEARTBEAT ( SRC_TSTAMP TIMESTAMP DEFAULT SYSTIMESTAMP NOT NULL, CONSTRAINT PM_SRC_HEARTBEAT_PK PRIMARY KEY (SRC_TSTAMP) ); COMMIT;
  2. In the PowerExchange Navigator, create a capture registration for the PM_SRC_HEARTBEAT table. Set the registration
    Status
    option to
    Active
    , and set the
    Condense
    option to
    Part
    .
    PowerExchange generates an associated extraction map. The extraction map includes the generated DTL__CAPXTIMESTAMP column. When you import the extraction map into PowerCenter to create a source definition, this time stamp column is included in the source definition.
  3. Create a script that updates the SRC_TSTAMP column in the PM_SRC_HEARTBEAT table with the current system time stamp.
    The script should perform an update instead of an insert to prevent the table from becoming too large. You can use the following sample DDL:
    update PM_SRC_HEARTBEAT set SRC_TSTAMP = CURRENT_TIMESTAMP; COMMIT;
    Schedule the script to run at a regular interval, such as 5 seconds or 5 minutes. The frequency depends on your environment.
    For the script to run successfully, it must have an initial value to update. Before you run the script for the first time, manually perform an insert that populates the SRC_TSTAMP column with current system time stamp.
  4. On the target system, create the target heartbeat table.
    Use a table name that implies the table function, such as PM_TGT_HEARTBEAT. The tables contains five columns, including a SRC_TSTAMP column that corresponds to the SRC_TSTAMP column in the PM_SRC_HEARTBEAT table. The other columns are:
    • SESSION_NAME. Contains the session name from the PowerCenter CDC session.
    • CAPX_TSTAMP. Contains the DTL_CAPXTIMESTAMP date and time in a format that can be compared with the other time stamps. This value is the time stamp that was recorded by the source database when an update was written to the PM_SRC_HEARTBEAT table.
    • MAP_TSTAMP. Contains the current system time stamp on the PowerCenter Integration Service system when the source update was processed by the CDC session.
    • TGT_TSTAMP. Contains the current system time stamp on the target when a row for a source update is inserted in the PM_TGT_HEARTBEAT table for a session.
    You can use the following DDL to create an Oracle table that includes these columns:
    CREATE TABLE PM_TGT_HEARTBEAT ( SESSION_NAME VARCHAR2(30) not null, SRC_TSTAMP TIMESTAMP not null, CAPX_TSTAMP TIMESTAMP not null, MAP_TSTAMP TIMESTAMP not null, TGT_TSTAMP TIMESTAMP DEFAULT SYSTIMESTAMP not null, CONSTRAINT PM_TGT_HEARTBEAT_PK PRIMARY KEY (SESSION_NAME,SRC_TSTAMP) ); COMMIT;
    In the DDL for the TGT_TSTAMP, the clause "DEFAULT SYSTIMESTAMP not null" causes the target system time stamp to be written to this column. You do not have to create a script to perform this function.
  5. In PowerCenter Designer, add the source and target heartbeat tables, an Expression transformation, and an Update Strategy transformation to an existing mapping for a workflow that you want to monitor:
    1. In the Source Analyzer, import the PowerExchange extraction map for the PM_SRC_HEARTBEAT table to create the source definition.
    2. In the Target Designer, create a target definition for the PM_TGT_HEARTBEAT table.
    3. In the Mapping Designer, drag the source and target definitions into a mapping that is associated with a workflow that you want to monitor.
    4. Create and configure an Expression transformation.
      On the
      Ports
      tab of the
      Edit Transformations
      dialog box, complete the following actions:
      • To add the SESSION_NAME and MAP_TSTAMP columns, add the following built-in variables in the
        Expressions
        field:
        • $PMSessionName. This variable populates the SESSION_NAME column with a session name. Set the
          Port Type
          field to
          Output
          , use a datatype of
          string
          , and set the
          Precision
          field to a value that accommodates the length of your session names.
        • SYSDATE. This variable populates the MAP_TSTAMP column with the system time stamp at the point when the source heartbeat data flowed through the mapping. Set the
          Port Type
          field to
          Output
          , and use a datatype of
          date/time
          .
      • To convert the DTL__CAPXTIMESTAMP input values to a format that can be compared to the other time stamps, add the CAPX_TSTAMP as an output port. Then, enter the following TO_DATE function for the CAPX_TSTAMP port:
        TO_DATE(DTL__CAPXTIMESTAMP,'YYYYMMDDHH24MISSUS')
      The following figure shows how the
      Ports
      tab in the
      Edit Transformation
      dialog box should appear when you are done:
    5. Repeat Step d. to create another Expression transformation for the target.
    6. Create and configure an Update Strategy transformation.
      On the
      Properties
      tab, open the Expression Editor for the
      Update Strategy Expression
      attribute, and then enter
      DD_INSERT
      . This function causes all rows to be treated as inserts to the PM_TGT_HEARTBEAT table.
      The following figure shows how the
      Properties
      tab should appear when you are done:
    7. In the mapping, place the Expression transformation and Update Strategy transformations between the source and target definitions. The Expression translation should precede the Update Strategy transformation. Then, link the ports between all elements in the flow, including the source definition, source qualifier, transformations, and target definition.
      The following figure shows an example mapping that includes the heartbeat tables in the top flow:
  6. In Workflow Manager, configure the connections and session for the workflow that you want to monitor.
    1. Create and configure connections for the source and target databases, if they are not available.
      For the source, use a PWX Oracle CDC Real Time application connection. For the target, use a PowerCenter relational connection.
    2. In Task Developer, double-click the CDC session in the workflow to edit session properties.
      In the
      Edit Tasks
      dialog box, set the following attributes:
      • On the
        Properties
        tab, set the
        Treat source rows as
        attribute to
        Data driven
        .
      • On the Mapping tab, under
        Sources
        , click the source qualifier for the PM_SRC_HEARTBEAT table, and then click the pencil icon for the source connection. In the
        Connection Object Definition
        dialog box, set the
        Image Type
        attribute to
        AI
        . This setting causes the workflow to write only after images of change data to the PM_TGT_HEARTBEAT table and to all other targets in the CDC session.
  7. Start the workflow that includes the heartbeat tables.
  8. Analyze the time stamp data that is written to PM_TGT_HEARTBEAT table for the session.
As the PowerCenter CDC session moves change data of interest to the target, the PM_TGT_HEARTBEAT table concurrently accumulates time stamp information for the session.
You can configure other CDC sessions and mappings to write time stamp information to the same PM_TGT_HEARTBEAT table. A table row provides information for a specific session.

0 COMMENTS

We’d like to hear from you!