Data Profiling
Option | Description |
---|---|
Maximum Number of Value Frequency Pairs | Number of column values with the highest frequencies that appear in the profile results. Default is 500. For example, if you set the value to 100, only the top 100 values appear in the profile results. If you do not want to save the value frequency information of a profile in the profiling warehouse, set the value to 0. |
Maximum Number of Patterns | Number of patterns with the maximum number of occurrences that appear in the profile results. The rest of the patterns appear under the Others category in the Results area. Default is 10. For example, if you set the value to 3, the top 3 patterns appear with their statistics, and the rest of the patterns are consolidated under the Others category. |
Pattern Threshold Percentage | Maximum percentage of values used to derive a pattern in the profile results. Default is 5. For example, when you set the value to 4, the patterns that account for 4% or more of the values appear individually with their statistics, and the rest of the patterns are consolidated under the Others category. |
Infer Date and Time | Infers the date and time for a column of date or time data type. Default is Yes. |
Detect Outliers | Detects pattern and value frequency outliers in the source object. Default is Yes. |
Minimum Number of Rows for Split Process per Column | If the source object contains more rows than the minimum number of rows that you enter here, Data Profiling uses one subtask for each source column when the profile is run. Default is 100,000,000. |
Maximum Number of Columns per Mapping | Number of columns for each mapping when the number of source rows is fewer than the Minimum Number of Rows for Split Process per Column value. Default is 50. |
Maximum Memory per Mapping* | Maximum amount of memory that you want to allocate for each mapping. Default is 512 MB. |
Default Buffer Block Size | Size of buffer blocks used to move data blocks from sources to targets. Default is Auto. Enter one of the following options: |
DTM Buffer Size | Amount of memory allocated to the task from the DTM process. Default is Auto. By default, a minimum of 12 MB is allocated to the buffer at run time. Use one of the following options: |
Line Sequential Buffer Length | Number of bytes that the task reads for each row in a flat file source. Default is 1024. |

* The mapping is a type of subtask that Data Profiling creates and runs for a data profiling task to process the data concurrently.
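
The consolidation and split rules described in the table can be sketched in Python. This is a minimal illustration under stated assumptions, not how Data Profiling is implemented: the function names, the sample pattern strings, the use of an `Others` dictionary key, and the handling of a row count exactly equal to the split threshold (treated here as "do not split") are all hypothetical choices made for the example.

```python
from collections import Counter


def consolidate_top_patterns(pattern_counts, max_patterns=10):
    """Keep the max_patterns most frequent patterns; fold the rest into 'Others'."""
    ranked = Counter(pattern_counts).most_common()
    shown = dict(ranked[:max_patterns])
    rest = sum(count for _, count in ranked[max_patterns:])
    if rest:
        shown["Others"] = rest
    return shown


def consolidate_by_threshold(pattern_counts, threshold_pct=5):
    """Keep patterns whose share of all values is at least threshold_pct percent."""
    total = sum(pattern_counts.values())
    shown, rest = {}, 0
    for pattern, count in pattern_counts.items():
        if count * 100 >= threshold_pct * total:
            shown[pattern] = count
        else:
            rest += count
    if rest:
        shown["Others"] = rest
    return shown


def plan_mappings(row_count, columns,
                  min_rows_for_split=100_000_000,
                  max_columns_per_mapping=50):
    """Group columns into mappings: one column per mapping for large sources,
    otherwise batches of at most max_columns_per_mapping columns."""
    if row_count > min_rows_for_split:
        return [[column] for column in columns]
    # Assumption: a row count equal to the threshold is not split per column.
    return [columns[i:i + max_columns_per_mapping]
            for i in range(0, len(columns), max_columns_per_mapping)]


if __name__ == "__main__":
    # Hypothetical pattern counts for a 100-row column.
    counts = {"9(5)": 60, "X(3)": 25, "9(3)-9(4)": 10, "XX9": 3, "X9X": 2}
    print(consolidate_top_patterns(counts, max_patterns=3))
    print(consolidate_by_threshold(counts, threshold_pct=4))
    print([len(group) for group in plan_mappings(1_000, [f"col{i}" for i in range(120)])])
```

With the sample counts, both consolidation functions keep the three most common patterns and fold the remaining 5 values into `Others`, which mirrors the examples given for Maximum Number of Patterns and Pattern Threshold Percentage. The table does not say how the two options interact when both are set, so the sketch treats them independently.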