Informatica Data Quality
Effect of doubling the number of rows on each profile job type:

Profile Job Type | Effect | Description |
---|---|---|
Column profile | >= 2X | Sorting is the major cost component, so scaling is somewhat more than 2X. For certain use cases, other components dominate and scaling is closer to linear. |
Data domain discovery | >= 2X | For flat file data sources, scaling is exactly 2X. In other cases, as with a column profile, scaling might be more than 2X. |
Key discovery | ~ 2X | Scaling is usually linear, but it depends on the data and on the complexity of relationships in the data. |
Functional dependency discovery | ~ 2X | Scaling is usually linear, but it depends on the data and on the complexity of relationships in the data. |
Overlap discovery | Step 1: 2X; Step 2: constant | The first step, computing the signatures, takes time directly proportional to the number of rows. The second step takes the same amount of time regardless of the number of rows. |
Foreign key discovery | Step 1: 2X; Step 2: constant | The first step, computing the signatures, takes time directly proportional to the number of rows. The second step takes the same amount of time regardless of the number of rows. |
Enterprise discovery | ~ 2X | Enterprise discovery combines column profiling, data domain discovery, key discovery, and foreign key discovery. It scales as the average of these functions. |
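
The row-scaling effects above can be approximated with a few simple growth models. The following is a minimal sketch, not Informatica's implementation: it assumes that sort-dominated work grows as n*log2(n), that "~ 2X" work grows linearly, and that two-step jobs have a linear first step and a constant second step. All function names and the sample timings are hypothetical.

```python
import math


def sort_bound_factor(base_rows: int, new_rows: int) -> float:
    """Growth factor for work dominated by an n*log2(n) sort (assumed model)."""
    if base_rows < 2 or new_rows < 2:
        raise ValueError("row counts must be at least 2")
    base_work = base_rows * math.log2(base_rows)
    new_work = new_rows * math.log2(new_rows)
    return new_work / base_work


def linear_factor(base_rows: int, new_rows: int) -> float:
    """Growth factor for work directly proportional to the number of rows."""
    if base_rows < 1:
        raise ValueError("base_rows must be positive")
    return new_rows / base_rows


def two_step_seconds(step1_seconds: float, step2_seconds: float,
                     base_rows: int, new_rows: int) -> float:
    """Estimate a two-step job: step 1 scales linearly, step 2 stays constant."""
    return step1_seconds * linear_factor(base_rows, new_rows) + step2_seconds


if __name__ == "__main__":
    # Doubling 1M rows: the sort-bound factor is slightly above 2.
    print(sort_bound_factor(1_000_000, 2_000_000))
    # Hypothetical timings: 100 s for signatures, 30 s for the comparison step.
    print(two_step_seconds(100.0, 30.0, 1_000_000, 2_000_000))
```

Under these assumptions, a sort-bound job is slightly slower than 2X when rows double, which matches the ">= 2X" entries, while the two-step jobs grow by less than 2X overall because the second step does not change.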
Effect of doubling the number of columns on each profile job type:

Profile Job Type | Effect | Description |
---|---|---|
Column profile | 2X | Columns are independent of each other, so scaling is linear. |
Data domain discovery | 2X | Columns are independent of each other, so scaling is linear. |
Key discovery | 2 to the power of X | In some cases, key discovery must compare all combinations of columns to find the keys. The effect is exponential in the number of columns. |
Functional dependency discovery | 2 to the power of X | In some cases, functional dependency discovery must compare all combinations of columns to find the dependencies. The effect is exponential in the number of columns. |
Overlap discovery | Step 1: 2X; Step 2: X to the power of 2 | The first step, computing the signatures, scales linearly with the number of columns. The second step compares every column with every other column and scales as the square of the number of columns. The second step runs faster than the first step. |
Foreign key discovery | Step 1: 2X; Step 2: X to the power of 2 | The first step, computing the signatures, scales linearly with the number of columns. The second step compares every column with every other column and scales as the square of the number of columns. The second step runs faster than the first step. |
Enterprise discovery | ~ 2X | Enterprise discovery combines column profiling, data domain discovery, key discovery, and foreign key discovery. It scales as the average of these functions. |
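
To see why the column-scaling effects differ so sharply, it helps to count the units of work in each case. The sketch below is illustrative only and assumes worst-case behavior: one unit of work per non-empty column combination for key and functional dependency discovery, and one unit per unordered column pair for the comparison step of overlap and foreign key discovery. The function names are hypothetical and are not part of any Informatica API.

```python
def column_combinations(num_columns: int) -> int:
    """Number of non-empty column subsets: exponential in the column count."""
    if num_columns < 0:
        raise ValueError("num_columns must be non-negative")
    return 2 ** num_columns - 1


def column_pairs(num_columns: int) -> int:
    """Number of unordered column pairs: quadratic in the column count."""
    if num_columns < 0:
        raise ValueError("num_columns must be non-negative")
    return num_columns * (num_columns - 1) // 2


if __name__ == "__main__":
    for columns in (10, 20):
        print(columns, column_combinations(columns), column_pairs(columns))
    # 10 columns: 1023 combinations, 45 pairs
    # 20 columns: 1048575 combinations, 190 pairs
```

Doubling from 10 to 20 columns multiplies the pair count by roughly 4 but multiplies the combination count by roughly 1,000, which is why wide tables affect key and functional dependency discovery far more than the other job types.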