The Applier run might include multiple apply cycles. During an apply cycle, the Applier processes intermediate files and commits the changes to the target at the end of the cycle.
During each apply cycle, the Applier processes one or more intermediate files depending on the apply.process_intermediate_size_per_job parameter value. This parameter determines the number of intermediate files that the Applier processes during an apply cycle. If this parameter is set to 0, the Applier processes all available intermediate files. If this parameter is set to a value greater than 0, the parameter specifies the maximum total size of all intermediate files, in megabytes, that the Applier processes during a single apply cycle. Data Replication always processes entire intermediate files. Data Replication never splits an intermediate file to avoid exceeding the maximum total size that is specified in this parameter. You control the maximum size of a single intermediate file by setting the Maximum size of each intermediate file option on the Runtime Settings tab > General view.
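The file selection described above can be illustrated with a minimal Python sketch. It is not the product's implementation: the function name, the input format, and the decision to include the file that crosses the limit are assumptions made for illustration. The text states only that files are never split.

```python
from typing import List, Tuple

MB = 1024 * 1024


def select_files_for_cycle(files: List[Tuple[str, int]], limit_mb: int) -> List[str]:
    """Pick whole intermediate files for one apply cycle.

    files    -- (name, size_in_bytes) pairs in processing order (hypothetical input)
    limit_mb -- stand-in for apply.process_intermediate_size_per_job
    """
    if limit_mb == 0:
        # 0 means: process all available intermediate files.
        return [name for name, _ in files]

    selected: List[str] = []
    total = 0
    for name, size in files:
        if total >= limit_mb * MB:
            break
        # Assumption: a file is never split, so the file that crosses
        # the limit is still taken in full.
        selected.append(name)
        total += size
    return selected
```

With three 40 MB files and a limit of 60, this sketch selects the first two files and leaves the third for the next cycle.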
An intermediate file is composed of a data file (.dat) and a transaction file (.trn). The .trn files contain transaction metadata. The .dat files contain the transaction data changes and can be very large.
When the Applier processes the intermediate files during an apply cycle, it looks for a commit in the .trn files. After the Applier encounters a commit in a .trn file, the Applier starts reading the corresponding .dat files. If the Applier does not encounter a commit in the .trn files during the current apply cycle, the Applier queues the corresponding .dat files. Then, whenever the Applier encounters a commit during a subsequent apply cycle, it processes all of the queued .dat files.
When the Applier processes a .dat file, it applies all committed transactions to the target. The Applier accumulates changes that belong to open transactions in memory buffers. Change data for each long-running transaction is stored in a separate buffer. After the Applier encounters a commit for a long-running transaction during a subsequent apply cycle, the Applier applies the data from the corresponding buffer to the target database and then clears the buffer.
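The per-transaction buffering can be sketched as follows. This is a simplified model, not the Applier's code: the record layout (xid, kind, payload), the class name, and the list that stands in for the target database are all hypothetical. For brevity, the sketch buffers every change until its commit arrives, whether the commit comes in the same cycle or a later one.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class OpenTransactionBuffers:
    """Keeps one in-memory buffer per open transaction (illustrative only)."""

    def __init__(self) -> None:
        self.buffers: Dict[str, List[str]] = defaultdict(list)
        self.applied: List[str] = []  # stand-in for the target database

    def process_cycle(self, records: List[Tuple[str, str, str]]) -> None:
        """records: (xid, kind, payload), where kind is 'change' or 'commit'."""
        for xid, kind, payload in records:
            if kind == "change":
                # Accumulate the change in the buffer of its transaction.
                self.buffers[xid].append(payload)
            elif kind == "commit":
                # Apply the buffered changes, then clear the buffer.
                self.applied.extend(self.buffers.pop(xid, []))
```

A transaction whose commit arrives in a later cycle simply stays in `buffers` between calls to `process_cycle`.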
For target data warehouse appliances that restrict the number of load connections, the Applier concurrently loads data to a batch of target tables. The number of tables in a batch cannot exceed the number of available Applier threads. For the remaining tables, the Applier accumulates change data from each source table in a separate memory buffer. After the Applier loads the data for the batch of tables to the target, it reads the data for the next batch of tables from the corresponding buffers. After the Applier loads data to all of the target tables, it commits the changes and finalizes the apply cycle.
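As a rough illustration of the batching rule, the following Python sketch splits a list of target tables into batches that never exceed the number of available Applier threads. The function name and arguments are hypothetical, and the sketch does not model the actual load or the per-table buffers.

```python
from typing import Iterator, List


def table_batches(tables: List[str], applier_threads: int) -> Iterator[List[str]]:
    """Yield batches of tables, each no larger than the thread count."""
    if applier_threads < 1:
        raise ValueError("applier_threads must be at least 1")
    for start in range(0, len(tables), applier_threads):
        yield tables[start:start + applier_threads]
```

For five tables and two threads, the sketch yields three batches. In the process described above, the commit would follow only after the last batch is loaded.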
The apply.buffer_size_for_split_records runtime parameter specifies the maximum size of the buffers that accumulate data for long-running transactions and target tables. If the amount of data in the buffer exceeds the specified limit, the Applier flushes the data from the buffer to a temporary spill file in the <DataReplication_installation>/output/<configuration_name>/tmp directory. The Applier then writes subsequent changes to the spill file instead of to the buffer. The spill file names have the following formats:
For long-running transactions: <configuration_name>_<transaction_xid>.spill
For target tables: <configuration_name>_<table_id>.spill
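The spill behavior and the naming formats can be sketched in Python as follows. The class, its byte-based limit, and the helper function are assumptions for illustration only. The text does not state the unit of apply.buffer_size_for_split_records or how the Applier performs the flush.

```python
from typing import List


def spill_file_name(configuration_name: str, identifier: str) -> str:
    """identifier is a transaction XID or a table ID, per the formats above."""
    return f"{configuration_name}_{identifier}.spill"


class SpillBuffer:
    """In-memory buffer that switches to a spill file once it grows too large."""

    def __init__(self, spill_path: str, max_bytes: int) -> None:
        self.spill_path = spill_path
        self.max_bytes = max_bytes  # hypothetical stand-in for the parameter value
        self.memory: List[bytes] = []
        self.size = 0
        self.spilled = False

    def append(self, record: bytes) -> None:
        if self.spilled:
            # After the first flush, later changes go to the spill file.
            with open(self.spill_path, "ab") as f:
                f.write(record)
            return
        self.memory.append(record)
        self.size += len(record)
        if self.size > self.max_bytes:
            # Limit exceeded: flush the buffered data to the spill file.
            with open(self.spill_path, "wb") as f:
                f.writelines(self.memory)
            self.memory.clear()
            self.size = 0
            self.spilled = True
```

Once a buffer has spilled, it stays in spill mode in this sketch, which mirrors the statement that subsequent changes are written to the spill file instead of to the buffer.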
Because the spill files can be large, the Applier does not flush existing spill files to disk when taking checkpoints that are used to resume processing after an outage. Consequently, when the Applier restarts, it deletes existing spill files that might be incomplete and re-reads the intermediate files to process all of the records in a long-running transaction.
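The restart cleanup can be pictured with a short Python sketch. The function name and the directory argument are hypothetical. The sketch shows only the deletion of leftover spill files, not the re-reading of intermediate files.

```python
import glob
import os
from typing import List


def discard_spill_files(tmp_dir: str) -> List[str]:
    """Delete every *.spill file in tmp_dir and return the removed file names."""
    removed: List[str] = []
    pattern = os.path.join(glob.escape(tmp_dir), "*.spill")
    for path in sorted(glob.glob(pattern)):
        # A leftover spill file might be incomplete, so it is not reused.
        os.remove(path)
        removed.append(os.path.basename(path))
    return removed
```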