Large File Support

Some computer operating systems limit the maximum file size to 2GB. We will refer to these as "small systems". Other operating systems have native support for files larger than 2GB. We will refer to these as "large systems".
To overcome the file size limitation, the Data Clustering Engine can combine the free space on a number of file systems into one large logical file. A logical file larger than 2GB is composed of a number of small files known as "extents". Each extent must be less than 2GB in size.
Although this feature is not necessary on "large systems", it can be used to distribute the data over multiple file systems, thereby making use of fragmented space.
Large File Support is designed for binary files such as the database and index files. Restrictions apply to its use for text files, as described later.

Directory File

The management of extents can default to internal rules or can be user-supplied. A file known as the "directory file" can be defined to specify the number of extents as well as their names and sizes.
Each large file’s directory file is named using OS-specific rules. Currently the directory file, if present, has the name of the large file with .dir appended. For example, the directory file for match.db would be called match.db.dir.
Directory files contain multiple lines of text. Each line contains two blank-separated fields used to define an extent. The size of the extent (in bytes) is followed by the name of the extent. The maximum extent size is limited to 2GB - 1. An asterisk (*) may be used as a shorthand notation for the maximum extent size.
For example, these definitions define two extents. The first is limited to 1MB and the second extent defaults to 2GB - 1.
1048576 match.db.ext1
* match.db.ext2
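The explicit two-field format described above can be read with a few lines of code. The following is a minimal sketch, not the engine's own parser: the function name `parse_directory_file` is hypothetical, and skipping blank lines and rejecting out-of-range sizes are assumptions. It handles only lines that carry both a size and a name.

```python
MAX_EXTENT_SIZE = 2**31 - 1  # 2GB - 1, the limit stated in the text


def parse_directory_file(text):
    """Return a list of (size_in_bytes, extent_name) pairs.

    Hypothetical helper: handles only the explicit "size name" form.
    """
    extents = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # assumption: blank lines are ignored
        if len(fields) != 2:
            raise ValueError(f"expected a size and a name, got: {line!r}")
        size_field, name = fields
        # An asterisk is shorthand for the maximum extent size.
        size = MAX_EXTENT_SIZE if size_field == "*" else int(size_field)
        if not 0 < size <= MAX_EXTENT_SIZE:
            raise ValueError(f"extent size out of range: {size}")
        extents.append((size, name))
    return extents


example = "1048576 match.db.ext1\n* match.db.ext2\n"
extents = parse_directory_file(example)
```

For the two-extent example, `extents` holds a 1MB extent named match.db.ext1 followed by a 2GB - 1 extent named match.db.ext2.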
If all extents are to be of equal size, you can define a template for the base name of the extents. For example,
1048575 match.db.ext
*
will allocate extents of size 1048575 and name them using the rules documented in the Default Extent Names section below. Note that the second line, containing only the asterisk, enables this type of processing. Also, this mode of processing requires the extent size (1048575 in this example) to be a power of 2 (1048576 in this example) minus 1. To allow all extents to have the maximum size of 2GB - 1, use:
* match.db.ext
*
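The "power of 2 minus 1" requirement for template extent sizes is easy to check before writing a directory file. This is an illustrative sketch only; `is_valid_template_size` is a hypothetical name and the engine may validate sizes differently.

```python
MAX_EXTENT_SIZE = 2**31 - 1  # 2GB - 1


def is_valid_template_size(size):
    """True if size is a power of two minus 1 and no larger than 2GB - 1."""
    if not 0 < size <= MAX_EXTENT_SIZE:
        return False
    n = size + 1
    # A power of two has exactly one bit set, so n & (n - 1) is zero.
    return (n & (n - 1)) == 0
```

With this check, 1048575 and 2GB - 1 itself are accepted, while 1048576 is rejected.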
To allow all large files to have maximum size extents, create a file called extents.dir with the following text:
* %f
*
It is possible to set the maximum extent size using the environment variable SSAEXTENTSIZE. For example, SSAEXTENTSIZE=256k will limit the size of all extents to 256kB (minus 1).
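As a worked example of the "256kB (minus 1)" arithmetic, the sketch below converts a value such as 256k into a byte count. It assumes that k means 1024 bytes, which is consistent with the power-of-2-minus-1 rule above but is not stated in the text. The function name `extent_size_from_setting` is hypothetical, and only the k suffix shown in the example is handled.

```python
def extent_size_from_setting(value):
    """Convert a setting such as '256k' to a maximum extent size in bytes.

    Assumption: 'k' means 1024 bytes. Other forms are not described
    in the text, so this sketch rejects them.
    """
    text = value.strip().lower()
    if not text.endswith("k"):
        raise ValueError(f"unsupported SSAEXTENTSIZE form: {value!r}")
    kilobytes = int(text[:-1])
    return kilobytes * 1024 - 1  # "256kB (minus 1)"


size = extent_size_from_setting("256k")
```

Under that assumption, 256k corresponds to 262143 bytes.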

Default Extent Names

If a directory file does not exist when a large file is opened, default rules are used to name the extents. Each extent size defaults to 2GB - 1.
The first extent has the same name as the large file. Second and subsequent extents are created by appending two characters to the file name. The extensions are named aa, ab, ac, ..., az, ba, ..., zz. This means that 1 + 26*26 = 677 extents are possible, giving a maximum logical file size of about 1.3TB.
Using the example above, the extents would be named:
match.db
match.db.extaa
match.db.extab
...
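The naming sequence and the size limit can be reproduced with a short generator. This is a sketch of the rules as described here, not the engine's code: `extent_names` and its `base` parameter are hypothetical, with `base` standing for the template name (match.db.ext in the example) and defaulting to the large file's own name.

```python
from itertools import product
from string import ascii_lowercase

MAX_EXTENT_SIZE = 2**31 - 1  # 2GB - 1


def extent_names(large_file, base=None):
    """Yield extent names: the large file itself, then base + aa .. zz."""
    base = large_file if base is None else base
    yield large_file  # the first extent has the same name as the large file
    for first, second in product(ascii_lowercase, repeat=2):
        yield f"{base}{first}{second}"


names = list(extent_names("match.db", base="match.db.ext"))
max_bytes = len(names) * MAX_EXTENT_SIZE  # total capacity of all extents
```

The list contains 677 names, and `max_bytes / 2**40` comes to roughly 1.3, matching the stated maximum logical file size.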

Small System Rules

Small systems support large files using the rules above. Extents are defined using a directory file. If a directory file does not exist, default sizes and names are used.

Large System Rules

Large systems have native support for files larger than 2GB. Operating systems such as Windows-NT 4.0, HPUX and Digital/Compaq Unix are in this category.
Large files do not use extents on these systems unless a directory file is defined. In the latter case, extents are still limited to 2GB - 1.

Restrictions

Large file support was designed for binary files. In general, text files are not supported by the extent mechanism.
Text files do not default to using extents when a directory file is not present. Even if a directory file is defined and extents are used, correct results cannot be guaranteed on every platform. If you wish to use large text files, you should use an operating system that supports them natively.
POST will create a text file if either the Trim or CR OUTPUT-OPTIONS are used. If neither is specified, the output is binary (fixed length records) and can therefore use the extent mechanism.
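The small-system, large-system and text-file rules can be condensed into one decision. The function below is only a summary of the rules as read from this section; `uses_extents` and its parameter names are hypothetical.

```python
def uses_extents(has_directory_file, large_system, is_text):
    """Return True if a file would be split into extents.

    A summary of the documented rules, not the engine's actual logic.
    """
    if has_directory_file:
        # A directory file enables extents on any system. For text
        # files, correct results are not guaranteed on every platform.
        return True
    # Without a directory file, only binary files on small systems
    # fall back to the default extent sizes and names.
    return not large_system and not is_text
```

For instance, a binary file on a small system uses extents by default, while the same file on a large system does so only when a directory file is defined.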
