Profiling and Discovery Sizing Guidelines

Profiling Warehouse Usage Worksheet

To estimate the profiling warehouse usage of enterprise data discovery, you must also account for the estimates you use for non-enterprise discovery jobs.
In the Value column of each of the following worksheets, enter the result of the formula in the Calculation column. You can then estimate the profiling warehouse tablespace and combine it with the estimates for non-enterprise discovery jobs.
Column Profiles
Use the following worksheet to record the values for column profiles:
| Metric | Calculation | Value |
|---|---|---|
| Average profile size | NC* X [((2 X AVL*) + 64) X (MVF* X AC*)] | |
| Total number of tables for all the enterprise discovery profiles | - | |
*Metrics have the following definitions:
  • NC. The average number of columns across all tables that you run a profile on.
  • AVL. The average length of a value in characters across all columns and all tables that you run a profile on.
  • MVF. The maximum number of value frequencies that the profiling warehouse saves for each column. The default value is 16000.
  • AC. The average cardinality across all columns and all tables that you run a profile on. The cardinality of a column is the number of unique values in that column, expressed as a percentage.
Calculation
To calculate the required profiling warehouse tablespace for a column profile, multiply the two values in the Value column: the average profile size and the total number of tables.
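The column profile worksheet can be sketched in Python as follows. This is a minimal illustration, not an official sizing tool: the function name and the sample inputs are hypothetical, AC is assumed to be entered as a fraction (0.5 for 50 percent), and the result is in whatever unit the worksheet formula yields, which the worksheet does not state.

```python
def column_profile_tablespace(nc, avl, mvf, ac, num_tables):
    """Estimate the column profile tablespace from the worksheet metrics.

    nc         -- average number of columns per profiled table
    avl        -- average value length in characters
    mvf        -- maximum number of value frequencies saved per column
    ac         -- average cardinality, ASSUMED here to be a fraction (0.5 = 50%)
    num_tables -- total number of tables for all enterprise discovery profiles
    """
    # Worksheet formula: NC X [((2 X AVL) + 64) X (MVF X AC)]
    average_profile_size = nc * (((2 * avl) + 64) * (mvf * ac))
    # Worksheet calculation: multiply the two values in the Value column.
    return average_profile_size * num_tables


# Hypothetical sample inputs; 16000 is the documented default for MVF.
estimate = column_profile_tablespace(nc=20, avl=10, mvf=16000, ac=0.5, num_tables=100)
print(estimate)
```

Replace the sample inputs with the averages measured for your own tables before relying on the result.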
Data Domain Discovery
Use the following worksheet to record the values for data domain discovery:
| Metric | Calculation | Value |
|---|---|---|
| Average data domain size | NC* X 254 | |
| Total number of tables for all the enterprise discovery profiles | - | |
*NC. The average number of columns across all tables that you run a profile on.
Calculation
To calculate the required profiling warehouse tablespace for data domain discovery, multiply the two values in the Value column: the average data domain size and the total number of tables.
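As a minimal sketch, the data domain discovery worksheet reduces to a single multiplication. The function name and sample inputs below are hypothetical, and the unit of the result is whatever the worksheet formula yields.

```python
def data_domain_tablespace(nc, num_tables):
    """Estimate the data domain discovery tablespace from the worksheet metrics.

    nc         -- average number of columns per profiled table
    num_tables -- total number of tables for all enterprise discovery profiles
    """
    # Worksheet formula: NC X 254
    average_data_domain_size = nc * 254
    # Worksheet calculation: multiply the two values in the Value column.
    return average_data_domain_size * num_tables


# Hypothetical sample inputs.
estimate = data_domain_tablespace(nc=20, num_tables=100)
print(estimate)
```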
Primary Key Discovery
Use the following worksheet to record the values for primary key discovery:
| Metric | Calculation | Value |
|---|---|---|
| Average key result size | APK* X [(128 + (32 X AVL*))] | |
| Total number of tables for all the enterprise discovery profiles | - | |
*Metrics have the following definitions:
  • APK. The average number of primary keys for each Primary Key Discovery profile. A general guideline is to set this parameter to 100.
  • AVL. The average length of a value in characters across all columns and all tables that you run a profile on.
Calculation
To calculate the required profiling warehouse tablespace for primary key discovery, multiply the two values in the Value column: the average key result size and the total number of tables.
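The primary key discovery worksheet can be sketched the same way. The function name and sample inputs are hypothetical; the default of 100 for APK follows the general guideline given in the metric definitions.

```python
def primary_key_tablespace(avl, num_tables, apk=100):
    """Estimate the primary key discovery tablespace from the worksheet metrics.

    avl        -- average value length in characters
    num_tables -- total number of tables for all enterprise discovery profiles
    apk        -- average number of primary keys per profile (guideline: 100)
    """
    # Worksheet formula: APK X (128 + (32 X AVL))
    average_key_result_size = apk * (128 + (32 * avl))
    # Worksheet calculation: multiply the two values in the Value column.
    return average_key_result_size * num_tables


# Hypothetical sample inputs.
estimate = primary_key_tablespace(avl=10, num_tables=100)
print(estimate)
```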
Foreign Key Discovery
Use the following worksheet to record the values for foreign key discovery:
| Metric | Calculation | Value |
|---|---|---|
| Signatures | (NC* X NT*) X 3600 | |
| Foreign keys | (FKT* X NT*) X [224 + (2048 X AVL*)] | |
*Metrics have the following definitions:
  • NC. The average number of columns across all tables that you run a profile on.
  • NT. The total number of tables for all enterprise discovery profiles.
  • FKT. The estimated average number of foreign keys for each table.
  • AVL. The average length of a value in characters across all columns and all tables that you run a profile on.
Calculation
To calculate the required profiling warehouse tablespace for foreign key discovery, add the two values in the Value column: the signatures value and the foreign keys value.
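Unlike the other worksheets, the foreign key discovery worksheet adds its two rows instead of multiplying them. The sketch below assumes the foreign keys formula reads as (FKT X NT) X (224 + (2048 X AVL)); the function name and sample inputs are hypothetical.

```python
def foreign_key_tablespace(nc, nt, fkt, avl):
    """Estimate the foreign key discovery tablespace from the worksheet metrics.

    nc  -- average number of columns per profiled table
    nt  -- total number of tables for all enterprise discovery profiles
    fkt -- estimated average number of foreign keys per table
    avl -- average value length in characters
    """
    # Signatures row: (NC X NT) X 3600
    signatures = (nc * nt) * 3600
    # Foreign keys row, ASSUMED reading: (FKT X NT) X (224 + (2048 X AVL))
    foreign_keys = (fkt * nt) * (224 + (2048 * avl))
    # Worksheet calculation: add the two values in the Value column.
    return signatures + foreign_keys


# Hypothetical sample inputs.
estimate = foreign_key_tablespace(nc=20, nt=100, fkt=2, avl=10)
print(estimate)
```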
Enterprise Discovery
Calculation
To calculate the total profiling warehouse tablespace for enterprise discovery, add the tablespace for non-enterprise discovery jobs to the total of the tablespace values for the following profile job types:
  • Column profile
  • Data domain discovery
  • Primary key discovery
  • Foreign key discovery
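The final step is a plain sum, sketched below. The function name is hypothetical, and every input is a placeholder for a value you have already estimated: the four per-job-type tablespace figures from the worksheets plus your existing estimate for non-enterprise discovery jobs, all in the same unit.

```python
def enterprise_discovery_tablespace(
    non_enterprise,
    column_profile,
    data_domain_discovery,
    primary_key_discovery,
    foreign_key_discovery,
):
    """Total profiling warehouse tablespace for enterprise discovery.

    Adds the non-enterprise discovery estimate to the total of the four
    per-job-type estimates. All arguments must use the same unit.
    """
    job_type_total = (
        column_profile
        + data_domain_discovery
        + primary_key_discovery
        + foreign_key_discovery
    )
    return non_enterprise + job_type_total


# Hypothetical placeholder figures, not measured values.
total = enterprise_discovery_tablespace(
    non_enterprise=500_000_000,
    column_profile=1_344_000_000,
    data_domain_discovery=508_000,
    primary_key_discovery=4_480_000,
    foreign_key_discovery=11_340_800,
)
print(total)
```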
