When you create a capture registration for a source table, the PowerExchange Navigator generates a corresponding extraction map and application name for the extraction. The extraction map describes the columns for which to extract change data. You can edit the extraction map to remove columns from extraction processing. Also, you can create alternative extraction maps, each for a subset of the columns that are registered for capture. For Db2 data sources only, you can create a data map if you have user-defined or multi-field columns for which you want to manipulate data before loading it to the target.
From PowerCenter, you run a CDC workflow and session that extract and apply change data. To define a data source in PowerCenter, you can import the extraction map or import the table definition from the source database through PowerExchange. For Db2 only, you can import a Db2 data map instead of the extraction map. In most situations, Informatica recommends that you import the extraction map.
Also, you must define a mapping, session, and workflow in PowerCenter. Optionally, you can include transformations in the mapping to manipulate the change data. When you define a CDC session, you must specify a connection type. The connection type determines the extraction mode and access method that PowerExchange uses to extract data.
To extract change data directly from Db2 transaction logs, the Microsoft SQL Server distribution database, the MySQL binary log, Oracle redo log files, or a PostgreSQL replication slot, you must use the real-time extraction mode. To extract change data from PowerExchange Logger log files, you can use either the batch extraction mode or the continuous extraction mode.
The following table describes these extraction modes:
| Extraction Mode | Description |
| --- | --- |
| Real-time extraction mode | Reads change data directly from the database log files in near real time, on an ongoing basis. When the PowerExchange Listener receives an extraction request, it pulls the change data from the log files and transmits the data to PowerCenter for extraction and apply processing. This mode provides the lowest latency for change data extraction but potentially the highest impact on system resources. |
| Batch extraction mode | Reads change data from PowerExchange Logger log files that are in a closed state when an extraction request is made. After processing the log files, the extraction request ends. This mode provides the highest latency for change data extraction but minimizes the impact on system resources. |
| Continuous extraction mode | Reads change data continuously from open and closed PowerExchange Logger log files in near real time. This mode also minimizes database log accesses and the log retention period that is required for CDC. |
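The rules for the three extraction modes can be summarized in a small Python model. This is an illustrative sketch only, not a PowerExchange API: the names `ExtractionMode`, `ChangeSource`, `LoggerLogFile`, `allowed_modes`, and `readable_files` are hypothetical and exist only to restate the rules above.

```python
from dataclasses import dataclass
from enum import Enum


class ExtractionMode(Enum):
    REAL_TIME = "real-time"
    BATCH = "batch"
    CONTINUOUS = "continuous"


class ChangeSource(Enum):
    # Hypothetical labels for the two kinds of change source described above.
    DATABASE_LOG = "database log"                  # for example, Oracle redo logs
    LOGGER_LOG_FILES = "PowerExchange Logger log files"


def allowed_modes(source):
    """Return the extraction modes that the text permits for a change source."""
    if source is ChangeSource.DATABASE_LOG:
        return frozenset({ExtractionMode.REAL_TIME})
    return frozenset({ExtractionMode.BATCH, ExtractionMode.CONTINUOUS})


@dataclass(frozen=True)
class LoggerLogFile:
    name: str
    closed: bool  # True once the Logger has closed the file


def readable_files(mode, files):
    """Return the Logger log files that a request in the given mode may read."""
    if mode is ExtractionMode.BATCH:
        # Batch mode reads only files that are already closed, then ends.
        return [f for f in files if f.closed]
    if mode is ExtractionMode.CONTINUOUS:
        # Continuous mode reads both open and closed files.
        return list(files)
    raise ValueError("real-time mode reads database logs, not Logger log files")
```

For example, with one closed file and one open file, `readable_files(ExtractionMode.BATCH, files)` returns only the closed file, while the continuous mode returns both.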
To initiate change data extraction and apply processing, run a CDC workflow and session from PowerCenter.
During extraction processing, PowerExchange extracts changes from the change stream in chronological order based on the unit of work (UOW) end time. PowerExchange passes only the successfully committed changes to PowerCenter for processing. PowerExchange does not pass ABORTed or UNDO changes. If you are capturing changes from Db2 database logs or Oracle redo logs, changes that were contiguous in the change stream might not be contiguous in the reconstructed UOW that PowerExchange passes to PowerCenter.
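The ordering and filtering behavior described above can be sketched in Python. This is a simplified model under stated assumptions, not PowerExchange's implementation: `LogRecord` and `committed_uows` are hypothetical names, the position of the commit record stands in for the UOW end time, and UNDO handling is not modeled.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LogRecord:
    seq: int           # position in the change stream (assumed field)
    uow_id: str        # unit of work that the record belongs to
    kind: str          # "change", "commit", or "abort"
    payload: str = ""  # change data; empty for commit and abort records


def committed_uows(stream):
    """Group changes by UOW and emit only committed UOWs, ordered by end position."""
    pending = defaultdict(list)
    delivered = []
    for rec in sorted(stream, key=lambda r: r.seq):
        if rec.kind == "change":
            pending[rec.uow_id].append(rec.payload)
        elif rec.kind == "commit":
            # The UOW ends here, so it is delivered at this point in the order.
            delivered.append((rec.uow_id, pending.pop(rec.uow_id, [])))
        elif rec.kind == "abort":
            # Changes from an aborted UOW are discarded and never delivered.
            pending.pop(rec.uow_id, None)
    return delivered


stream = [
    LogRecord(1, "A", "change", "a1"),
    LogRecord(2, "B", "change", "b1"),
    LogRecord(3, "A", "change", "a2"),
    LogRecord(4, "B", "commit"),
    LogRecord(5, "C", "change", "c1"),
    LogRecord(6, "C", "abort"),
    LogRecord(7, "A", "commit"),
]
result = committed_uows(stream)
```

In this model, UOW B is delivered before UOW A because B ends first, even though A started first, and the changes from the aborted UOW C are dropped. The records `a1` and `b1` are adjacent in the stream but end up in different delivered UOWs.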
To properly resume extraction processing, PowerExchange maintains restart tokens for each source table. Restart tokens are used for all extraction modes. To generate current restart tokens, you can use the PowerExchange Navigator, the special override statement in the restart token file, or the DTLUAPPL utility.