User Guide

9.8.0 HotFix 2

Considerations for Running InitialSync

Review the following information about running the InitialSync task:

Typically, you use InitialSync to materialize empty target tables. If a target table is not empty, you can delete all of the data from the table by using the SQL TRUNCATE TABLE statement. If you use the ODBC driver to load data, you can set the initial.check_empty_tables runtime parameter to 1 or 2 to check if the mapped tables are empty.

If you run InitialSync from the command line, you can use the following command line parameters to selectively synchronize target tables:

- Set the RESYNC parameter to P to synchronize the tables that are not currently synchronized.
- Specify the DEST_TABLES or EXCLUDE_DEST_TABLES parameters to filter the tables to synchronize. In this case, you must set the RESYNC parameter to N.
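The dependency between the table filters and RESYNC can be checked before you start InitialSync. The following Python sketch is illustrative only. The `validate_sync_params` helper, the dictionary form of the parameters, and the sample table names are assumptions made for this example and are not part of Data Replication. The sketch encodes only the rule stated above and does not reproduce the actual InitialSync command line syntax.

```python
def validate_sync_params(params):
    """Return a list of problems found in a hypothetical parameter mapping.

    `params` maps command line parameter names (as named in this topic)
    to the values you intend to pass. The mapping format is an assumption
    for this sketch; it is not the InitialSync command line syntax.
    """
    problems = []
    resync = params.get("RESYNC")
    # True when either table filter has a non-empty value.
    uses_filter = any(
        params.get(name) for name in ("DEST_TABLES", "EXCLUDE_DEST_TABLES")
    )
    # Rule from the text: filtering tables requires RESYNC to be N.
    if uses_filter and resync != "N":
        problems.append("DEST_TABLES or EXCLUDE_DEST_TABLES requires RESYNC=N")
    return problems


# Hypothetical values, for illustration only.
print(validate_sync_params({"RESYNC": "P", "DEST_TABLES": "ORDERS"}))
```

An empty list means that the combination does not violate the rule checked here. Other parameter requirements are outside the scope of this sketch.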

You can improve InitialSync performance by using multiple threads. You can specify the number of InitialSync threads in the InitialSync threads field on the Runtime Settings tab > General view. A single InitialSync thread loads data to a single target table.

For a particular configuration, you can run InitialSync and the Extractor at the same time if the set of tables for which you run InitialSync does not intersect with the set of tables for which you run the Extractor.
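The non-intersection condition is a set-disjointness test. As a minimal sketch, assuming that you keep the two table lists yourself (the function names and table names below are hypothetical, and names are compared exactly as given, including case):

```python
def can_run_concurrently(initialsync_tables, extractor_tables):
    """Return True when the two table sets share no table."""
    return set(initialsync_tables).isdisjoint(extractor_tables)


def overlapping_tables(initialsync_tables, extractor_tables):
    """Return the tables that appear in both sets, sorted for readability."""
    return sorted(set(initialsync_tables) & set(extractor_tables))


# Hypothetical table names, for illustration only.
initialsync = ["SALES.ORDERS", "SALES.ITEMS"]
extractor = ["SALES.ITEMS", "SALES.CUSTOMERS"]

if not can_run_concurrently(initialsync, extractor):
    print("Overlap:", overlapping_tables(initialsync, extractor))
```

If the overlap list is not empty, the condition for running InitialSync and the Extractor at the same time is not met for that configuration.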

Because DB2 for Linux, UNIX, and Windows does not provide a way to retrieve the current LSN from the database, InitialSync inserts a record for each mapped target table in a service table. Data Replication creates the service table with the default name of DBSYNC_SYNC_LSN in the DB2 source database. After you start change data capture, the Extractor uses these records to determine the initial LSN to pass to the Applier. To keep source data consistent, InitialSync locks each source table to prevent write access.

For Microsoft SQL Server targets, InitialSync uses the snapshot transaction isolation level to provide data consistency. InitialSync records the LSN value of the source data unload transaction.

For Microsoft SQL Server targets on Windows, InitialSync might run out of memory when it uses the Bulk Copy Program (BCP) to load a large amount of LOB data to the target tables.

Informatica recommends that you use the ODBC driver for initial materialization of Microsoft SQL Server target tables when you need to load a large amount of LOB data to the tables. To use the ODBC driver, run InitialSync with the DIRECT=n command line parameter.

If you want to use BCP to materialize tables with a large amount of LOB data, decrease the number of InitialSync threads and the value of the global.lob_truncation_size runtime parameter to avoid running out of memory.

For Oracle sources, you can use Informatica Fast Clone instead of InitialSync to materialize target tables. Fast Clone provides better performance. For more information, see the Informatica Fast Clone documentation.

For most Oracle sources, you can use multiple InitialSync subtask threads to unload data. To enable InitialSync multithreaded processing, use the initial.oracle.parallel_subtask_limit and initial.oracle.parallel_sample_percentage runtime parameters.

If you use the ODBC drivers to connect to targets, InitialSync supports multithreaded load processing for all target types.

If you use the target native load utilities, InitialSync supports multithreaded load processing for target types other than the following unsupported types:

- Oracle
- Teradata

Also, InitialSync does not support multithreaded processing for the following table types:

- Oracle source tables that have subpartitions.
- Source tables that have virtual columns with associated Tcl scripts or SQL expressions.

If you unload data from these types of tables, ensure that the initial.oracle.parallel_sample_percentage runtime parameter is set to 0.
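The multithreading rules above can be summarized as two small checks. The following Python sketch only restates those rules. The function names, the boolean flags, and the target type strings are assumptions for illustration and do not correspond to any Data Replication API.

```python
# Target types for which the text says native load utilities do not
# support multithreaded load processing.
NATIVE_LOAD_UNSUPPORTED_TARGETS = {"Oracle", "Teradata"}


def multithreaded_load_supported(target_type, uses_odbc):
    """Apply the load-side rule: ODBC works for all target types;
    native load utilities work for all types except the listed ones."""
    if uses_odbc:
        return True
    return target_type not in NATIVE_LOAD_UNSUPPORTED_TARGETS


def table_allows_multithreading(oracle_source_with_subpartitions,
                                has_scripted_virtual_columns):
    """Apply the table-side rule: neither restricted table type applies."""
    return not (oracle_source_with_subpartitions or has_scripted_virtual_columns)


# Hypothetical inputs, for illustration only.
print(multithreaded_load_supported("Teradata", uses_odbc=False))
print(table_allows_multithreading(False, True))
```

When the table-side check returns False, the text above calls for setting initial.oracle.parallel_sample_percentage to 0.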

For Microsoft SQL Server targets, if you use the BCP utility to load data, verify that the BCP requirements for parallel processing of a table are met. For more information, see the Microsoft SQL Server documentation at the following website: http://msdn.microsoft.com/en-us/library/aa196739(v=sql.80).aspx.

Do not use InitialSync with the following target types, for which initial synchronization is not needed:

- Apache Kafka
- Cloudera
- Hortonworks
- Flat File

For Microsoft SQL Server and Oracle sources, before you run InitialSync for a table that was added to an existing replication configuration, ensure that the table does not contain uncommitted data from an open transaction.

If you run InitialSync for a source table that contains uncommitted data and then start the Extractor, the Extractor might not process the change data from the open transaction after the transaction is committed.
