Data Profiling
Option
| Recommendations
|
---|---|
Maximum Number of Value Frequency Pairs
| Default is 500. Decrease or increase this value based on the business need.
You can set the maximum number of value frequency pairs to no more than 10,000 for each data profiling task.
|
Maximum Number of Patterns
| Default is 10. Decrease or increase this value based on the business need.
|
Pattern Threshold Percentage
| Default is 5. Decrease or increase this value based on the business need.
|
Infer Date and Time
| By default, Data Profiling infers the date and time for a column of date or time data type. Clear this option if you do not want to infer the date and time for such columns in the data source.
Inferring date and time consumes significant resources, which might affect Data Profiling performance.
|
Detect Outliers
| By default, outliers are detected in the profile results. Clear this option if you do not want to detect and view outliers in the data source.
|
Minimum Number of Rows for Split Process per Column
| Default is 100,000,000. Increase or decrease this value based on the business need. The row-based criteria use this option to optimize performance.
For example, if you set the value to 100,000 and the source object contains 100,500 rows and 30 columns, Data Profiling creates 30 subtasks, one for each column in the source object.
|
Maximum Number of Columns per Mapping*
| Default is 50. Increase or decrease this value based on the business need. The column-based criteria use this option to optimize performance.
For example, you set the value to 30 and the Minimum Number of Rows for Split Processing per Column value to 100,000,000. If the source object contains 149 columns and 70,000 rows, Data Profiling creates a subtask for every 30 columns, which results in five subtasks. Four subtasks contain 30 columns each, and one subtask contains 29 columns.
|
Maximum Memory per Mapping
| Default is 512 MB. Increase or decrease this value based on the business need.
|
Default buffer block size
| Default is Auto. To set a specific size, enter a numeric value followed by KB, MB, or GB, based on the business need.
|
DTM buffer size
| Default is Auto. To set a specific size, enter a numeric value followed by KB, MB, or GB, based on the business need.
By default, a minimum of 12 MB is allocated to the buffer at run time.
You might increase the DTM buffer size in the following circumstances:
|
Line Sequential Buffer Length
| Default is 1024. Increase the value if the source flat file records are larger than 1024 bytes.
|
* The mapping is a type of subtask. Data Profiling creates and runs subtasks for a data profiling task to process the data concurrently.
|
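
The two worked examples in the table (row-based and column-based splitting) can be reproduced with a short calculation. The sketch below is illustrative only and is not Data Profiling's actual scheduling logic. The function name `plan_subtasks` and its parameters are hypothetical. It also assumes that row-based splitting applies when the row count reaches the minimum-rows threshold and that column-based grouping applies otherwise, which is inferred from the two examples rather than stated in the table.

```python
def plan_subtasks(num_rows, num_columns,
                  min_rows_for_split=100_000_000,
                  max_columns_per_mapping=50):
    """Return a list holding the number of columns handled by each subtask.

    Hypothetical helper: it models only the two examples in the table above.
    """
    if num_rows < 0 or num_columns < 1 or max_columns_per_mapping < 1:
        raise ValueError("row and column counts must be positive")

    # Assumption: at or above the row threshold, each column gets its own subtask.
    if num_rows >= min_rows_for_split:
        return [1] * num_columns

    # Otherwise, group columns into chunks of at most max_columns_per_mapping.
    full_groups, remainder = divmod(num_columns, max_columns_per_mapping)
    sizes = [max_columns_per_mapping] * full_groups
    if remainder:
        sizes.append(remainder)
    return sizes


# Row-based example: threshold 100,000, source has 100,500 rows and 30 columns.
row_based = plan_subtasks(100_500, 30, min_rows_for_split=100_000)
print(len(row_based))   # 30 subtasks, one per column

# Column-based example: 149 columns, 70,000 rows, at most 30 columns per mapping.
column_based = plan_subtasks(70_000, 149,
                             min_rows_for_split=100_000_000,
                             max_columns_per_mapping=30)
print(column_based)     # [30, 30, 30, 30, 29]
```

A calculation like this can help estimate how many subtasks a given setting produces before you change the defaults. The actual behavior should be confirmed against a real profiling run.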