To provide for flexible recovery from both planned and unplanned interruptions that disrupt replication, Data Replication records checkpoint information separately for the Extractor, Server Manager, and Applier. By recording checkpoint information for each of these components, Data Replication can prevent data loss and ensure data consistency across all of the replication stages.
Data Replication stores checkpoint information in SQLite databases. Data Replication creates a separate SQLite database for each Extractor, Server Manager, and Applier task on the system where the task runs. These SQLite databases are in addition to the SQLite databases that Data Replication uses to store configuration and internal information for Extractor, Applier, and InitialSync processing.
SQLite is installed as part of the Data Replication installation. Data Replication creates and maintains all of its SQLite databases.
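Keeping each component's checkpoint state in its own SQLite database means a recovering component can resume from its last committed position independently of the others. The following minimal sketch illustrates that idea; the table and column names are hypothetical, since Data Replication manages its own internal schema:

```python
import sqlite3

# Illustrative sketch only: a per-component checkpoint table in SQLite.
# The schema here is invented for the example; it is not the product's
# actual internal schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE checkpoint (
           component TEXT NOT NULL,  -- Extractor, Server Manager, or Applier
           position  TEXT NOT NULL,  -- last processed log position
           updated   TEXT DEFAULT CURRENT_TIMESTAMP
       )"""
)

def record_checkpoint(component, position):
    # Each component records its own progress independently, so recovery
    # can resume each replication stage from its last known-good point.
    with conn:  # commits the insert atomically
        conn.execute(
            "INSERT INTO checkpoint (component, position) VALUES (?, ?)",
            (component, position),
        )

record_checkpoint("Extractor", "redo-log:1042:5580")
record_checkpoint("Applier", "commit-scn:99871")

# On restart, read back the most recent checkpoint for one component.
row = conn.execute(
    "SELECT position FROM checkpoint WHERE component = ? "
    "ORDER BY rowid DESC LIMIT 1",
    ("Extractor",),
).fetchone()
print(row[0])
```

Because each component writes to its own database, a failure in one stage never blocks checkpointing in another.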
The following events trigger Extractor checkpoints:
- The Extractor reaches the end of a log file, which can be an Oracle archive or online redo log, a MySQL binary log, or a SQL Server backup file.
- The Extractor reads all available records from the source database logs.
- In Continuous mode, the Extractor ends a microcycle of the duration that is specified in the Continuous Replication Latency option.
- For Microsoft SQL Server sources, the Extractor reads a chunk of a log file. The chunk size is specified by the extract.mssql.checkpoint_size parameter.
- For DB2 for Linux, UNIX, and Windows sources, the Extractor reads a chunk of a log file. The chunk size is specified by the extract.db2.checkpoint_size parameter.
- For Oracle sources, the Extractor tries to read an online redo log that the Oracle database overwrote with new data.
- For Oracle source instances in a RAC environment, an Oracle instance stops and the Extractor processes all of the redo logs in the thread.
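The chunk-based triggers above (for SQL Server and DB2 sources) follow a simple pattern: after a fixed amount of log data is read, a checkpoint is due. The sketch below illustrates that pattern generically; CHUNK_SIZE stands in for the extract.mssql.checkpoint_size or extract.db2.checkpoint_size parameter, and the unit (records per chunk) is chosen here only for illustration:

```python
# Generic sketch of chunk-based checkpointing: after each full chunk of the
# log, and again at the end of the log, a checkpoint is recorded.
CHUNK_SIZE = 4  # records per chunk (hypothetical unit for this sketch)

def read_log(records, chunk_size=CHUNK_SIZE):
    """Yield (record, checkpoint_due) pairs. checkpoint_due is True when a
    full chunk has been consumed or the log is exhausted."""
    for i, record in enumerate(records, start=1):
        checkpoint_due = (i % chunk_size == 0) or (i == len(records))
        yield record, checkpoint_due

checkpoints = []
for record, due in read_log([f"rec{n}" for n in range(1, 11)]):
    if due:
        checkpoints.append(record)  # the position replication could resume from

print(checkpoints)  # → ['rec4', 'rec8', 'rec10']
```

A smaller chunk size means more frequent checkpoints and less rework after a failure, at the cost of more checkpoint writes.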
The following events trigger Applier checkpoints:
- The Applier commits data to the target.
- The Applier uses subtask threads to apply changes, and a thread that applies a primary key update writes a commit to the target.
- For Apache Kafka targets, the Applier saves the sequence of the last change operation successfully sent to the target as a checkpoint in a checkpoint file, provided that you use the default guaranteed delivery mode. If you do not use guaranteed delivery, the Applier writes a checkpoint after each Commit operation. The checkpoint file must exist on the system where the Applier, the CDC Publisher, and a Server Manager instance (Main server or subserver) run. You can change the checkpoint file name or directory by editing the apply.kafka.kafka_checkpoint_file_name and apply.kafka.kafka_checkpoint_file_directory runtime parameters. By default, the checkpoint file name matches the configuration name.
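The Kafka checkpoint-file behavior described above follows a common pattern: persist the sequence of the last confirmed send so a restart can resume from that position instead of resending everything. The following sketch shows that pattern in isolation; the file name and contents are illustrative, and the actual format of the file controlled by apply.kafka.kafka_checkpoint_file_name is internal to the product:

```python
import os

# Sketch of the checkpoint-file pattern: after each change operation is
# confirmed delivered, persist its sequence number. The file name and
# contents here are hypothetical, not the product's actual format.
CHECKPOINT_FILE = "demo_checkpoint.txt"

def save_checkpoint(sequence):
    # Write to a temp file, then rename, so the checkpoint file is never
    # observed in a half-written state.
    tmp = CHECKPOINT_FILE + ".tmp"
    with open(tmp, "w") as f:
        f.write(str(sequence))
    os.replace(tmp, CHECKPOINT_FILE)

def load_checkpoint():
    try:
        with open(CHECKPOINT_FILE) as f:
            return int(f.read())
    except FileNotFoundError:
        return None  # no checkpoint yet: start from the beginning

for seq in range(1, 6):
    # ... send change operation `seq` and wait for delivery confirmation ...
    save_checkpoint(seq)

resume = load_checkpoint()  # the position a restarted Applier would resume from
print(resume)
os.remove(CHECKPOINT_FILE)  # clean up the demo file
```

The write-then-rename step matters: if the process fails mid-write, the previous checkpoint remains intact rather than being corrupted.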