Service Group Application Reference

Service Group Application Reference

Physical Data Organization

Physical Data Organization

This section provides information about the physical data organization.

The SSA-NAME3 "Key" File or Table

Because SSA-NAME3 potentially generates multiple keys per name or address, most databases will require that a separate file or table be setup to contain, at the very least, the SSA-NAME3 key and an Id-No to refer back to the source record.
In many systems, a design for this separate file or table that also redundantly carries the names and other identity data used for matching or display, will most likely optimize the physical I/O due to the elimination of multiple file accesses or database joins. That is, this will allow the search, matching and display to be achieved by processing a single file/table only.
In some cases, it may also be possible to optimize physical I/O by declaring an index on the concatenation of the SSA-NAME3 key, name and other identity fields. This will allow the search, matching and display to be achieved by index only processing. Some databases which support repeating field structures can do this without the need for an extra file or table.
For more information on this de-normalized table design, refer to the Introduction for Application Programmers section of the
INTRODUCTION TO SSA-NAME3 Guide
.

Optimizing the SSA-NAME3 Key Access

The SSA-NAME3 storage keys are designed so that high volume files that are necessarily low in update activity can benefit by loading the file such that the logical and physical sequence is the SSA-NAME3 key.
If other access requirements conflict, then at least the index can be "clustered" or sequenced logically.
These observations make it potentially damaging to apply a hashing algorithm, or even a bit truncation algorithm, to the key. The key is designed to optimize a very badly skewed search problem, care should be exercised in any further physical optimization.

Optimizing the SSA-NAME3 Key Load Process

The process of populating a database table with SSA-NAME3 keys will, in most database environments, be more efficient if the database’s loader utility is used, rather than using record level inserts to the database. This is more evident the greater the volume of records to be keyed.
Bulk key-load applications can be designed to write flat files of keys and data in a format for loading to the database using its loader utility. For more information on the key-building application, refer to the Introduction for Application Programmers section of the
INTRODUCTION TO SSA-NAME3 Guide
.
After creating the file of keys and data, and before running the database’s loader utility, the file should be sorted on the SSA-NAME3 key to improve access at search time. In some databases, this may also improve load performance.
If a large number of records are to be sorted, choose an efficient sort, making good use of memory and distributed sort work files.
Some database systems also allow indexing of key fields after the file data has been loaded, and this may be more efficient than building the index dynamically as the file is being loaded.
Bulk-loader programs will also normally work more efficiently if their input is a flat file, rather than a database table. When reading and writing flat files, further optimization can be gained by increasing the block or cache size of the input and output data files.
Client-server systems should avoid performing bulk-loads across the network. Rather, a server based program will usually be more efficient.
When extreme volumes of data are to be keyed, try to create multiple concurrent instances of the keybuilding process which process non-overlapping partitions of the input data. These can then be put back together at sort time. Care should be taken, however, that the CPU or I/O subsystem are not already overloaded.
If the opportunity exists to off-load the key-building work to a more efficient or less busy processor, such as a powerful computer, the overall efficiency of the process may be easier to manage and predict.

0 COMMENTS

We’d like to hear from you!