Table of Contents

Search

  1. Preface
  2. Introduction
  3. Installation
  4. Design
  5. Operation

Tuning / Load Statistics

Tuning / Load Statistics

The Loader produces statistics that can be used to select appropriate values for repeat counts. For example:
Flatten:................................Denormalized-In 4196 Flatten: Unique-Keys 2094 Flatten: IDT Out 2894 Flatten: alt_ent_num [ 2] Flatten:................................ [ 0] 1585 75.69% Flatten: [ 1] 363 93.03% Flatten: [ 2] 71 96.42% Flatten: [ 3] 24 97.56% Flatten:................................ [ 4] 14 98.23% Flatten: [ 5] 9 98.66% Flatten: [ 6] 9 99.09% Flatten: [ 7] 3 99.24% Flatten:................................ [ 8] 2 99.33% Flatten: [ 9] 7 99.67% Flatten: [ 10] 3 99.81% Flatten: [ 11] 0 99.81% Flatten:................................ [ 12] 0 99.81% Flatten: [ 13] 0 99.81% Flatten: [ 14] 0 99.81% Flatten: [ 15] 2 99.90% Flatten:................................ [ 16] 0 99.90% Flatten: [ 17] 0 99.90% Flatten: [ 18] 1 99.95% Flatten: [ 19] 0 99.95% Flatten:................................ [ 20] 0 99.95% Flatten: [ 21] 0 99.95% Flatten: [ 22] 0 99.95% Flatten: [ 23] 0 99.95% Flatten:................................ [ 24] 0 99.95% Flatten: [ 25] 0 99.95% Flatten: [ 26] 0 99.95% Flatten: [ 27] 1 100.00%
This report shows that 4196 rows were created as a result of the denormalization (SQL join) process, while 2094 of those rows contained unique keys (
FLATTEN_KEYS=
). After flattening, 2894 rows were loaded to the IDT. This means that 800 rows overflowed to secondary rows because the number of repeats was insufficient to hold all occurrences.
The next part of the report shows a histogram for each repeating field. The field
alt_ent_num
was defined to have two repeats ([2]) for this load test. The histogram tabulates the actual number of repeats in each normalized row (prior to flattening) and a cumulative percentage of rows. For example, the line
Flatten: [ 1] 363 93.03%
states that 363 rows has only one occurrence, and this accounts for 93% of the total rows. The table also tells us that the maximum number of occurrences was 27, and that 1585 rows had no value for this field.
The cumulative percentage can be used to select an appropriate value for the number of repeats. For example, if
alt_ent_num
were defined to have six repeats, over 99% of values would fit into one IDT row (no overflows).

0 COMMENTS

We’d like to hear from you!