Profiling and Discovery Sizing Guidelines

Profiling Warehouse Guidelines for Foreign Key and Overlap Discovery

The disk space required for foreign key discovery and overlap discovery depends on the number of inferred foreign keys and overlapping column pairs. These items take up a large amount of space in the profiling warehouse if you set a large number for foreign key discovery and overlap discovery.
You can use the following formulas to compute the disk space. The Profiling Service Module computes column signatures one time for foreign key discovery and overlap discovery.
Signatures
Number of Columns in Schema X 3600
Where
  • Number of Columns in Schema is the total number of columns in the profile model. After the Profiling Service Module generates the column signature for a profile task, subsequent profile tasks reuse the signature.
  • 3600 is the amount of space required to store the signatures for one column.
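As a rough illustration, the signature formula can be sketched in Python. The function name and the sample column count are hypothetical, and the result is treated as bytes, which is an assumption because the formula does not state a unit.

```python
# Space needed to store the signatures for one column, taken from the formula.
# The unit is assumed to be bytes; the source does not state it.
SIGNATURE_SPACE_PER_COLUMN = 3600


def signature_space(num_columns: int) -> int:
    """Estimate profiling warehouse space for column signatures."""
    if num_columns < 0:
        raise ValueError("num_columns must not be negative")
    return num_columns * SIGNATURE_SPACE_PER_COLUMN


# Hypothetical profile model with 500 columns.
print(signature_space(500))
```

Because signatures are computed one time and reused by subsequent profile tasks, this estimate is counted once per profile model rather than once per task.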
Foreign Keys
Number of Foreign Keys X 2 X Average Number of Key Columns X 32 + Number of Foreign Keys X (32 + (2 Bytes for Each Character X Average Column Size)) X Average Number of Key Columns X Average Number of Violating Rows
Where
  • Number of Foreign Keys is the number of inferred foreign keys.
  • Average Number of Key Columns is the average number of columns in the primary or foreign key.
  • The value 2 is the multiplier to get the total number of columns for the foreign key.
  • The value 32 is the number of bytes required to store one column in the key.
  • Average Column Size is the average number of characters in a column, with numbers and dates counted as if they were converted to the String datatype.
  • The value 2 Bytes for Each Character is the typical number of bytes for a single Unicode character.
  • Average Number of Violating Rows is the average number of rows that violate the foreign key either in the parent table or child table.
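The foreign key formula can be sketched in Python as follows. The function name, parameter names, and sample values are hypothetical. The sketch reads the second term as (32 + 2 X Average Column Size) for each key column of each violating row, which is an assumption about how the terms group.

```python
BYTES_PER_KEY_COLUMN = 32  # bytes to store one column in the key
BYTES_PER_CHARACTER = 2    # typical bytes for a single Unicode character


def foreign_key_space(
    num_foreign_keys: int,
    avg_key_columns: float,
    avg_column_size: float,
    avg_violating_rows: float,
) -> float:
    """Estimate profiling warehouse space, in bytes, for inferred foreign keys."""
    # First term: key definitions; the factor 2 gives the total number of
    # columns for the foreign key.
    key_definitions = num_foreign_keys * 2 * avg_key_columns * BYTES_PER_KEY_COLUMN
    # Second term: violating rows, assumed to store each key column's
    # 32-byte entry plus its value as 2-byte characters.
    bytes_per_stored_column = BYTES_PER_KEY_COLUMN + BYTES_PER_CHARACTER * avg_column_size
    violations = (
        num_foreign_keys
        * bytes_per_stored_column
        * avg_key_columns
        * avg_violating_rows
    )
    return key_definitions + violations


# Hypothetical inputs: 100 foreign keys, 2 key columns on average,
# 20 characters per column, 50 violating rows per key.
print(foreign_key_space(100, 2, 20, 50))
```

With these sample inputs, the violating rows account for nearly all of the estimate, so the average number of violating rows is the input worth estimating most carefully.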
Overlap Discovery
Number of Overlap Pairs X 2 X 32
Where
  • Number of Overlap Pairs is the number of inferred overlap pairs.
  • The value 2 is the number of columns in the pair.
  • The value 32 is the number of bytes required to store one column in the overlap pair.
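A minimal Python sketch of the overlap discovery formula, using a hypothetical function name and sample pair count:

```python
COLUMNS_PER_PAIR = 2            # each overlap pair has two columns
BYTES_PER_OVERLAP_COLUMN = 32   # bytes to store one column in the pair


def overlap_space(num_overlap_pairs: int) -> int:
    """Estimate profiling warehouse space, in bytes, for inferred overlap pairs."""
    if num_overlap_pairs < 0:
        raise ValueError("num_overlap_pairs must not be negative")
    return num_overlap_pairs * COLUMNS_PER_PAIR * BYTES_PER_OVERLAP_COLUMN


# Hypothetical result set of 1,000 inferred overlap pairs.
print(overlap_space(1000))
```

Each pair costs a fixed 64 bytes, so this estimate grows linearly with the number of inferred overlap pairs.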
