Applier Processing of Intermediate Files

The Applier run might include multiple apply cycles. During an apply cycle, the Applier processes intermediate files and commits the changes to the target at the end of the cycle.
During each apply cycle, the Applier processes one or more intermediate files, depending on the value of the apply.process_intermediate_size_per_job parameter. This parameter indirectly determines the number of intermediate files that the Applier processes during an apply cycle by limiting their total size. If the parameter is set to 0, the Applier processes all available intermediate files. If the parameter is set to a value greater than 0, the value specifies the maximum total size of all intermediate files, in megabytes, that the Applier processes during a single apply cycle. Data Replication always processes entire intermediate files. It never splits an intermediate file to avoid exceeding the maximum total size that is specified in this parameter. You control the maximum size of a single intermediate file by setting the Maximum size of each intermediate file option on the Runtime Settings tab > General view.
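The size limit can be illustrated with a minimal Python sketch. It is not the product's implementation. The function name select_files_for_cycle and the (name, size) tuples are hypothetical. The sketch also assumes that a file that crosses the limit is still taken whole, which is one reading of the statement that intermediate files are never split.

```python
from typing import List, Tuple

MB = 1024 * 1024

def select_files_for_cycle(files: List[Tuple[str, int]], limit_mb: int) -> List[str]:
    """Pick the intermediate files for one apply cycle.

    files: (file_name, size_in_bytes) pairs in processing order (hypothetical shape).
    limit_mb: stands in for apply.process_intermediate_size_per_job.
    """
    if limit_mb == 0:
        # 0 means no limit: take every available intermediate file.
        return [name for name, _ in files]

    limit_bytes = limit_mb * MB
    selected: List[str] = []
    total = 0
    for name, size in files:
        if total >= limit_bytes:
            break
        # Assumption: a file is always taken whole, even if it pushes
        # the running total past the limit.
        selected.append(name)
        total += size
    return selected
```

With four 40 MB files and a limit of 100, the sketch selects the first three files, because the third file is taken whole rather than split at the 100 MB mark.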
An intermediate file is composed of a data file (.dat) and a transaction file (.trn). The .trn files contain transaction metadata. The .dat files contain the transaction data changes and can be very large.
When the Applier processes the intermediate files during an apply cycle, it looks for a commit in the .trn files. After the Applier encounters a commit in a .trn file, the Applier starts reading the corresponding .dat files. If the Applier does not encounter a commit in the .trn files during the current apply cycle, the Applier queues the corresponding .dat files. Then, whenever the Applier encounters a commit during a subsequent apply cycle, it processes all of the queued .dat files.
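As a rough illustration of this queueing behavior, the following Python sketch models one apply cycle. The function run_cycle and its input shape are hypothetical, and the sketch reduces each .trn file to a single flag that indicates whether it contains a commit.

```python
from typing import List, Tuple

def run_cycle(
    queued_dat: List[str],
    cycle_files: List[Tuple[str, bool]],
) -> Tuple[List[str], List[str]]:
    """Model one apply cycle.

    queued_dat: .dat files carried over from earlier cycles.
    cycle_files: (dat_file_name, trn_has_commit) pairs for this cycle.
    Returns (dat_files_to_process, dat_files_still_queued).
    """
    pending = queued_dat + [dat for dat, _ in cycle_files]
    if any(has_commit for _, has_commit in cycle_files):
        # A commit was found: process everything that was waiting.
        return pending, []
    # No commit in this cycle: keep the .dat files queued.
    return [], pending
```

In this model, a cycle without any commit processes nothing and carries its .dat files forward. The first later cycle that contains a commit processes the carried-over files together with its own.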
When the Applier processes a .dat file, it applies all committed transactions to the target. The Applier accumulates changes that belong to open transactions in memory buffers. Change data for each long-running transaction is stored in a separate buffer. After the Applier encounters a commit for a long-running transaction during a subsequent apply cycle, the Applier applies the data from the corresponding buffer to the target database and then clears the buffer.
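The per-transaction buffering can be sketched in Python as follows. This is a simplified model, not the Applier's code. The record shape (xid, op, payload) and the function name process_records are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Tuple

Record = Tuple[int, str, Optional[str]]  # (xid, "change" | "commit", payload)

def process_records(
    records: Iterable[Record],
    buffers: Optional[Dict[int, List[str]]] = None,
) -> Tuple[List[str], Dict[int, List[str]]]:
    """Apply committed changes and keep open transactions buffered.

    buffers maps a transaction ID to its accumulated changes and is
    passed from one apply cycle to the next.
    """
    if buffers is None:
        buffers = defaultdict(list)
    applied: List[str] = []
    for xid, op, payload in records:
        if op == "change":
            # Each open transaction accumulates in its own buffer.
            buffers[xid].append(payload)
        elif op == "commit":
            # On commit, apply the buffered changes and clear the buffer.
            applied.extend(buffers.pop(xid, []))
    return applied, buffers
```

Passing the returned buffers into the next call models a long-running transaction whose commit arrives in a later apply cycle.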
For target data warehouse appliances that restrict the number of load connections, the Applier concurrently loads data to a batch of target tables. The number of tables in a batch cannot exceed the number of available Applier threads. For the remaining tables, the Applier accumulates change data from each source table in a separate memory buffer. After the Applier loads the data for the batch of tables to the target, it reads the data for the next batch of tables from the corresponding buffers. After the Applier loads data to all of the target tables, it commits the changes and finalizes the apply cycle.
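The batching rule can be shown with a short Python sketch. The function table_batches is hypothetical, and it only models how tables are grouped so that no batch exceeds the number of available Applier threads. It does not model the loading itself.

```python
from typing import List

def table_batches(tables: List[str], applier_threads: int) -> List[List[str]]:
    """Split target tables into batches of at most applier_threads tables."""
    if applier_threads < 1:
        raise ValueError("applier_threads must be at least 1")
    return [
        tables[i:i + applier_threads]
        for i in range(0, len(tables), applier_threads)
    ]
```

For five tables and two threads, the sketch yields three batches. While the first batch loads, change data for the remaining tables would wait in the per-table buffers, and the commit would follow the last batch.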
The apply.buffer_size_for_split_records runtime parameter specifies the maximum size of the buffers that accumulate data for long-running transactions and target tables. If the amount of data in the buffer exceeds the specified limit, the Applier flushes the data from the buffer to a temporary spill file in the DataReplication_installation/output/configuration_name/tmp directory. The Applier then writes subsequent changes to the spill file instead of to the buffer. The spill file names have the following formats:
  • For long-running transactions: configuration_name_transaction_xid.spill
  • For target tables: configuration_name_table_id.spill
Because the spill files can be large, the Applier does not flush existing spill files to disk when taking checkpoints that are used to resume processing after an outage. Consequently, when the Applier restarts, it deletes existing spill files that might be incomplete and re-reads the intermediate files to process all of the records in a long-running transaction.
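The spill behavior can be approximated with the following Python sketch. It is a simplified model under stated assumptions, not the product's implementation. The class SpillingBuffer and the helper spill_path are hypothetical names, the limit is expressed in bytes for simplicity, and only the file name pattern comes from the formats listed above.

```python
import os
from typing import List

def spill_path(tmp_dir: str, configuration_name: str, identifier: str) -> str:
    """Build a spill file path; identifier is a transaction XID or a table ID."""
    return os.path.join(tmp_dir, f"{configuration_name}_{identifier}.spill")

class SpillingBuffer:
    """In-memory buffer that switches to a spill file once it exceeds a limit."""

    def __init__(self, path: str, limit_bytes: int) -> None:
        self.path = path
        self.limit_bytes = limit_bytes  # stands in for apply.buffer_size_for_split_records
        self.records: List[bytes] = []
        self.size = 0
        self.spilled = False

    def append(self, record: bytes) -> None:
        if self.spilled:
            # After the first flush, new changes go straight to the spill file.
            with open(self.path, "ab") as f:
                f.write(record)
            return
        self.records.append(record)
        self.size += len(record)
        if self.size > self.limit_bytes:
            # Limit exceeded: flush the buffer to the spill file and empty it.
            with open(self.path, "wb") as f:
                f.writelines(self.records)
            self.records.clear()
            self.size = 0
            self.spilled = True
```

In this model, a restart would correspond to deleting the file at the spill path and rebuilding the buffer by re-reading the intermediate files.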
