Data Discovery Guide

Exclude Null Values

You can exclude null values when you perform data domain discovery on a data source. When you select the minimum percentage of rows option together with the exclude null values option, the conformance percentage is the number of matching rows divided by the number of non-null rows in the column, that is, the total number of rows minus the rows that contain null values.
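The ratio described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the product's implementation: the function name `conformance_percentage`, the `matches_domain` callback, and the use of `None` to represent a null value are all assumptions made for the example.

```python
from typing import Callable, Optional, Sequence


def conformance_percentage(
    values: Sequence[Optional[str]],
    matches_domain: Callable[[str], bool],  # hypothetical data domain rule
    exclude_nulls: bool = True,
) -> float:
    """Return the percentage of rows that match the data domain."""
    # None stands in for a null value in the column (an assumption).
    non_null = [v for v in values if v is not None]
    matching = sum(1 for v in non_null if matches_domain(v))
    # With the option on, nulls are removed from the denominator.
    denominator = len(non_null) if exclude_nulls else len(values)
    return 100.0 * matching / denominator if denominator else 0.0
```

With the option enabled, a column that is mostly null can still reach a high conformance percentage, because only the populated rows count toward the denominator.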
The data domain discovery process differs depending on how you combine the Exclude null values from data domain discovery option with the sampling options and filters.
The following scenarios explain the data domain discovery results when you choose the exclude null values option along with a sampling option and filters:
  • With All rows as the sampling option and no filters. Data domain discovery ignores all the null values in the column.
  • With a sampling option and no filters. Data domain discovery ignores all the null values in the sampled data and runs on the rest of the sampled data.
  • With All rows as the sampling option and with filters. Data domain discovery ignores all the null values in the filtered data and runs on the rest of the filtered data.
  • With a sampling option and filters. Data domain discovery ignores the null values in the filtered data in the sample and runs on the rest of the filtered data.
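The four scenarios can be read as one selection step with two optional stages. The sketch below is a rough model under stated assumptions: rows are plain dictionaries, `None` represents a null, sampling is applied before the filter, and random sampling stands in for whichever sampling option is chosen. The function `rows_for_discovery` and its parameters are hypothetical names, not part of the product.

```python
import random
from typing import Callable, List, Optional, Sequence


def rows_for_discovery(
    rows: Sequence[dict],
    column: str,
    sample_size: Optional[int] = None,  # None models the "All rows" option
    row_filter: Optional[Callable[[dict], bool]] = None,
    seed: int = 0,
) -> List[object]:
    """Return the non-null column values that discovery would evaluate."""
    selected = list(rows)
    if sample_size is not None:
        # Assumption: the sample is drawn first, then filtered.
        rng = random.Random(seed)
        selected = rng.sample(selected, min(sample_size, len(selected)))
    if row_filter is not None:
        selected = [r for r in selected if row_filter(r)]
    # Exclude null values: drop rows where the column is None.
    return [r.get(column) for r in selected if r.get(column) is not None]
```

Leaving both `sample_size` and `row_filter` unset corresponds to the first scenario; setting one or both corresponds to the remaining three.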

Example

You have a data source with 10,000 rows where 3,000 rows have Social Security Numbers in the Comments column. You create a column profile and data domain discovery and choose the following options:
  • Select the Exclude null values from data domain discovery option.
  • Select All rows as the sampling option.
  • Select the Minimum percentage of rows option and configure the option to 12%.
When you run the profile, data domain discovery runs on all rows in the data set, ignores the rows with null values in the Comments column, and calculates the conformance percentage from the remaining rows.
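The example does not say how many rows in the Comments column are null, so the arithmetic below uses a hypothetical count of 5,000 null rows purely for illustration. It also assumes that a column qualifies when its conformance percentage is at least the configured minimum.

```python
total_rows = 10_000         # from the example
matching_rows = 3_000       # rows with Social Security Numbers, from the example
minimum_percentage = 12.0   # configured threshold, from the example
null_rows = 5_000           # hypothetical: not stated in the example


def conformance(matching: int, total: int, nulls: int = 0) -> float:
    """Matching rows as a percentage of the rows left after removing nulls."""
    return 100.0 * matching / (total - nulls)


including_nulls = conformance(matching_rows, total_rows)             # 30.0
excluding_nulls = conformance(matching_rows, total_rows, null_rows)  # 60.0

# Assumed rule: the column qualifies when it meets the minimum percentage.
qualifies = excluding_nulls >= minimum_percentage
```

Under these assumed numbers, excluding null values raises the conformance percentage from 30% to 60%. Both values clear the 12% minimum, so the option changes the reported percentage here but not whether the column qualifies.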
