Performance

The Table Loader uses multiple threads to overlap its work during the data extraction, key generation, and DBMS load phases:
Reader
Reads source records from the database or input file and places them in a queue for the Key Generation threads to process.
Key Generation
Processes the source records to create IDX rows. There are n key generation threads by default, where n is the number of CPUs on the machine.
Writer
Writes the IDT and IDX rows to operating system files. These files are used as input to the DBMS Load utility. IDT rows are written directly to a flat-file. IDX rows are pushed into the MDM-RE Sort utility where they are sorted and written to an operating system file.
Loader
Loader threads merge the sort files and run the DBMS load utilities to load the IDT and IDXs in parallel. There are m Loader threads by default, where m is the number of CPUs on the machine.
The Table Loader can be tuned by setting the size of the Reader’s input queue and the Writer’s sort buffers as well as the number of key generation and loader threads.

Input Queue

The size of the Reader’s input queue is set with the environment variable
SSALDR_RBSIZE=nnn
where nnn is the number of records. The default value is 5000.
This parameter is also used to calculate the size of the key generation output queues. They are calculated as
SSALDR_RBSIZE / number_of_key_threads * 8
In order to keep the Key Generation and Writer threads busy, the input queue must be filled as quickly as possible.
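As a worked example of the output queue formula, the sketch below evaluates it for the default input queue size and a few thread counts. It assumes the expression is evaluated left to right (divide, then multiply) using integer arithmetic; the documentation does not state either point, and the function name is purely illustrative.

```python
def keygen_output_queue_size(rbsize: int = 5000, key_threads: int = 4) -> int:
    """Evaluate SSALDR_RBSIZE / number_of_key_threads * 8.

    Assumptions (not confirmed by the documentation):
    left-to-right evaluation and integer division.
    """
    if rbsize <= 0 or key_threads <= 0:
        raise ValueError("rbsize and key_threads must be positive")
    return rbsize // key_threads * 8


if __name__ == "__main__":
    # Default SSALDR_RBSIZE of 5000 with different key generation thread counts.
    for threads in (2, 4, 8):
        print(threads, keygen_output_queue_size(5000, threads))
```

Under these assumptions, raising SSALDR_RBSIZE enlarges every key generation output queue proportionally, which is why a single setting affects both ends of the key generation stage.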

Flat-File Input

When reading from a flat-file, the input queue can be filled very quickly, and in general, the bottleneck is in the Key Generation and/or Writer threads. Since the Writer thread blocks for a short period during sort processing, it is advantageous to have a large input queue (and therefore large key generation output queues), so that key generation can proceed concurrently.

Database Input

When the Reader’s input queue is filled with records from a database, the Reader thread is usually the bottleneck and the other threads spend time waiting for work.

Finding the bottleneck

A thread can wait for two reasons:
  • waiting for work in its input queue, or
  • waiting for space in its output queue (where it places its results)
To determine how often a thread had to wait, refer to the statistics in the Table Loader log file. When each thread ends, it reports the number of times it had to wait. For example,
Reader thread [1] ends. Records In 900000. Waits 960
Keygen thread [3] ends. Processed 450000. Waits 214
Keygen thread [4] ends. Processed 450000. Waits 214
Writer thread [2] Extract ends. IDT out 900000 Waits 448
The thread with the fewest "waits" is the busiest thread (the bottleneck). In the example above, the Key Generation threads were the busiest. The Reader thread spent some time waiting for the Key Generation threads to make room in the Reader’s output queue. This is typical of a flat-file load.
When reading from a database, it is not uncommon for the Reader thread to report zero waits. That is, it was reading records as fast as the DBMS could deliver them, and the other threads kept up with the workload, so the input queue always had enough room for the incoming records.
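The comparison of wait counts can be automated. The following is a minimal sketch, assuming each thread's end-of-run summary appears on its own line in the log and follows the wording of the example messages; the regular expression and function names are illustrative and are not part of the product.

```python
import re

# Matches summaries such as "Keygen thread [3] ends. Processed 450000. Waits 214".
# The pattern is inferred from the example messages only (an assumption).
THREAD_LINE = re.compile(
    r"^(?P<role>\w+) thread \[(?P<id>\d+)\].*?Waits (?P<waits>\d+)\s*$"
)


def parse_waits(log_text):
    """Return {(role, thread_id): waits} for every thread summary line found."""
    waits = {}
    for line in log_text.splitlines():
        match = THREAD_LINE.match(line.strip())
        if match:
            key = (match.group("role"), int(match.group("id")))
            waits[key] = int(match.group("waits"))
    return waits


def busiest_threads(waits):
    """Return the threads with the fewest waits (the likely bottleneck)."""
    if not waits:
        return []
    fewest = min(waits.values())
    return sorted(key for key, count in waits.items() if count == fewest)


SAMPLE = """\
Reader thread [1] ends. Records In 900000. Waits 960
Keygen thread [3] ends. Processed 450000. Waits 214
Keygen thread [4] ends. Processed 450000. Waits 214
Writer thread [2] Extract ends. IDT out 900000 Waits 448
"""

if __name__ == "__main__":
    print(busiest_threads(parse_waits(SAMPLE)))
```

For the sample messages, the two Keygen threads tie for the fewest waits, which matches the interpretation given for the flat-file load.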

Tuning

The objective is to make the input queue large enough to keep it from becoming the bottleneck.
If reading from the database and the Reader thread reports 0 waits, the input queue is large enough. If reading from a flat-file, the input queue must be set large enough so that the Key Generation threads are the busiest (fewest waits).
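The tuning rule can be expressed as a small check. This is a sketch only: it assumes the caller has already collected one wait count per thread role from the log (for example, the value reported by each role's threads), and the role names, source labels, and function name are hypothetical conveniences. A tie between Key Generation and another role is treated here as acceptable, which the documentation does not specify.

```python
def input_queue_is_large_enough(source, waits_by_role):
    """Apply the tuning rule to one run's wait counts.

    source: "database" or "flat-file" (labels chosen for this sketch).
    waits_by_role: e.g. {"Reader": 960, "Keygen": 214, "Writer": 448}.
    """
    if source == "database":
        # Rule: the Reader thread reports zero waits.
        return waits_by_role["Reader"] == 0
    if source == "flat-file":
        # Rule: the Key Generation threads have the fewest waits.
        keygen = waits_by_role["Keygen"]
        return all(
            keygen <= count
            for role, count in waits_by_role.items()
            if role != "Keygen"
        )
    raise ValueError(f"unknown source type: {source!r}")


if __name__ == "__main__":
    example = {"Reader": 960, "Keygen": 214, "Writer": 448}
    print(input_queue_is_large_enough("flat-file", example))
```

If the check fails, the rule suggests increasing SSALDR_RBSIZE and repeating the load to compare the new wait counts.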
