Profiling and Discovery Sizing Guidelines

Enterprise Discovery Resources

When you estimate enterprise discovery resources, you need to balance the benefit of running each individual profile quickly with additional resources against the costs and limitations of scaling those resources.
Enterprise discovery also requires evaluating the relational sources where the column profile SQL queries run. If you have many tables, enterprise discovery can quickly affect the performance of a relational source.
Relational Resources
Limit the number of concurrent profiling queries to the number of CPU cores on the database server, or fewer. For relational data source profiles, the Profiling Service Module profiles each column in a separate query against the relational source. The optimal configuration for an enterprise discovery run depends on the size of the relational source.
If you have fewer resources than the guidelines recommend, the Profiling Service Module takes longer to finish the enterprise discovery job.
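As a rough illustration of why the core count matters, the sketch below estimates how many rounds of column queries a discovery run needs when concurrency is capped at the number of database CPU cores. The function name `plan_query_waves` and the sample table names are hypothetical, and the model assumes that every column query takes about the same time. It does not describe how the Profiling Service Module actually schedules queries.

```python
import math
from typing import Mapping, Tuple


def plan_query_waves(columns_per_table: Mapping[str, int],
                     db_cpu_cores: int) -> Tuple[int, int]:
    """Return (total column queries, minimum number of query rounds).

    One query per column, as described above. Concurrency is capped at
    the database core count. Assumes uniform query duration (illustrative).
    """
    if db_cpu_cores < 1:
        raise ValueError("db_cpu_cores must be at least 1")
    total_queries = sum(columns_per_table.values())
    waves = math.ceil(total_queries / db_cpu_cores)
    return total_queries, waves


# Hypothetical schema: 40 columns in total, profiled on an 8-core database.
print(plan_query_waves({"customers": 12, "orders": 20, "items": 8}, 8))
```

With fewer cores the cap drops, so the same number of column queries needs more rounds, which is consistent with the longer run times noted above.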
The following table lists the recommended relational database resources based on the number of database tables in a typical enterprise discovery job:

| Number of Tables      | CPU Cores | Memory    | Concurrent Jobs |
|-----------------------|-----------|-----------|-----------------|
| Less than 100         | 4         | 8 GB      | 3               |
| Between 100 and 500   | 8         | 16 GB     | 5               |
| Between 500 and 1000  | 16        | 32 GB     | 10              |
| Between 1000 and 2000 | 32        | 64 GB     | 20              |
| More than 2000        | >=64      | >=128 GB  | 40              |
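The relational sizing table can be expressed as a simple lookup, sketched below. The names `RelationalSizing` and `relational_sizing` are hypothetical. Because the table ranges share their endpoints (for example, 500 appears in two rows), the sketch assumes that a boundary value belongs to the lower tier. The top tier returns the minimum values, since the table gives them as ">=".

```python
from typing import NamedTuple, Optional, List, Tuple


class RelationalSizing(NamedTuple):
    cpu_cores: int        # minimum value for the open-ended top tier
    memory_gb: int        # minimum value for the open-ended top tier
    concurrent_jobs: int


# (inclusive upper bound on table count, sizing); None means no upper bound.
# Assumption: boundary counts such as 500 fall into the lower tier.
_TIERS: List[Tuple[Optional[int], RelationalSizing]] = [
    (99, RelationalSizing(4, 8, 3)),
    (500, RelationalSizing(8, 16, 5)),
    (1000, RelationalSizing(16, 32, 10)),
    (2000, RelationalSizing(32, 64, 20)),
    (None, RelationalSizing(64, 128, 40)),
]


def relational_sizing(num_tables: int) -> RelationalSizing:
    """Look up the recommended database resources for a table count."""
    if num_tables < 0:
        raise ValueError("num_tables must not be negative")
    for upper, sizing in _TIERS:
        if upper is None or num_tables <= upper:
            return sizing
    raise AssertionError("unreachable: the last tier has no upper bound")


print(relational_sizing(750))
```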
Data Integration Service Resources
You can estimate the Data Integration Service resources for enterprise discovery in two ways. The first approach is to decide the level of concurrency for the number of tables in the enterprise discovery job. Then, you can use the worksheets for the specific mix of profile jobs.
The second approach is to use the general recommendations in the following table, which lists the recommended Data Integration Service resources based on the number of database tables in a typical enterprise discovery job:

| Number of Tables      | CPU Cores | Memory    | Temporary Disk / Spindles | Concurrent Jobs |
|-----------------------|-----------|-----------|---------------------------|-----------------|
| Less than 200         | 4         | 8 GB      | 20 GB / 1                 | 3               |
| Between 200 and 1000  | 8         | 32 GB     | 80 GB / 2                 | 5               |
| Between 1000 and 2000 | 16        | 64 GB     | 160 GB / 3                | 10              |
| More than 2000        | >=32      | >=128 GB  | >=320 GB / 4              | 20              |
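For quick estimates, the Data Integration Service recommendations can be looked up by table count with a binary search over the tier boundaries. This is only a sketch: `DisSizing` and `dis_sizing` are hypothetical names, boundary counts such as 1000 are assumed to fall into the lower tier, and the top tier returns the minimum values that the table gives as ">=".

```python
from bisect import bisect_left
from typing import NamedTuple


class DisSizing(NamedTuple):
    cpu_cores: int
    memory_gb: int
    temp_disk_gb: int
    spindles: int
    concurrent_jobs: int


# Inclusive upper bounds of the first three tiers; larger counts use the last tier.
# Assumption: a count equal to a shared boundary belongs to the lower tier.
_UPPER_BOUNDS = [199, 1000, 2000]
_TIERS = [
    DisSizing(4, 8, 20, 1, 3),
    DisSizing(8, 32, 80, 2, 5),
    DisSizing(16, 64, 160, 3, 10),
    DisSizing(32, 128, 320, 4, 20),  # minimums for more than 2000 tables
]


def dis_sizing(num_tables: int) -> DisSizing:
    """Return the recommended Data Integration Service resources."""
    if num_tables < 0:
        raise ValueError("num_tables must not be negative")
    # bisect_left finds the first tier whose upper bound is >= num_tables.
    return _TIERS[bisect_left(_UPPER_BOUNDS, num_tables)]


print(dis_sizing(1500))
```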
When you plan a deployment that runs both single-table discovery and enterprise discovery jobs, size it for the larger of the two configurations.
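One way to read this guidance is to take the larger value of each resource across the two use cases, as in the sketch below. That per-resource interpretation is an assumption, as are the names `NodeConfig` and `combined_config`. The single-table figures in the example are invented placeholders. The enterprise figures come from the "Between 200 and 1000" row of the Data Integration Service table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeConfig:
    cpu_cores: int
    memory_gb: int
    temp_disk_gb: int
    spindles: int


def combined_config(single_table: NodeConfig, enterprise: NodeConfig) -> NodeConfig:
    """Take the larger value of each resource (assumed interpretation)."""
    return NodeConfig(
        cpu_cores=max(single_table.cpu_cores, enterprise.cpu_cores),
        memory_gb=max(single_table.memory_gb, enterprise.memory_gb),
        temp_disk_gb=max(single_table.temp_disk_gb, enterprise.temp_disk_gb),
        spindles=max(single_table.spindles, enterprise.spindles),
    )


# Placeholder single-table estimate (not taken from the guidelines).
single_table = NodeConfig(cpu_cores=8, memory_gb=16, temp_disk_gb=100, spindles=2)
# Enterprise discovery row for between 200 and 1000 tables.
enterprise = NodeConfig(cpu_cores=8, memory_gb=32, temp_disk_gb=80, spindles=2)

print(combined_config(single_table, enterprise))
```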
