Data Discovery Guide

Column Profile Sampling Options for Enterprise Discovery

The sampling options determine whether the Developer tool runs a column profile on all rows of the data sources or on a limited number of rows.
The following table describes the column profile sampling options that you configure for enterprise discovery:
| Option | Description |
| --- | --- |
| All Rows | Runs a profile on all the rows in the data object. Supported in the Native, Blaze, and Spark run-time environments. |
| Sample First <number> rows | Runs a profile on the specified number of rows, starting from the first row of the data object. You can choose a maximum of 2,147,483,647 rows. Supported in the Native and Blaze run-time environments. |
| Limit N <number> rows | Runs a profile based on the number of rows in the data object. When you choose to run a profile in the Hadoop validation environment, the Spark engine collects samples from multiple partitions of the data object and pushes the samples to a single node to compute the sample size. The Limit N sampling option supports Oracle, SQL Server, and DB2 databases. You cannot apply the Advanced filter with the Limit N sampling option. Supported in the Spark run-time environment. |
| Random Percentage | Runs a profile on a percentage of rows in the data object. Supported in the Spark run-time environment. |
| Exclude data type inference for columns with an approved data type | Excludes columns with an approved data type from the data type inference of the column profile run. |
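The following Python sketch illustrates what the sampling options mean and which run-time environments the table lists for each one. It is not Informatica code and does not call any Informatica API. The names `SUPPORTED_ENVIRONMENTS`, `is_supported`, `sample_first`, and `random_percentage` are hypothetical, and the independent per-row selection used for Random Percentage is an assumption made for illustration, not a description of how the Spark engine samples. The constant 2,147,483,647 is the documented Sample First maximum, which equals 2^31 - 1.

```python
import random
from itertools import islice

# Documented maximum for "Sample First <number> rows" (equals 2**31 - 1).
MAX_SAMPLE_ROWS = 2**31 - 1

# Run-time environments per sampling option, transcribed from the table.
# The "Exclude data type inference" option lists no environment, so it is omitted.
SUPPORTED_ENVIRONMENTS = {
    "All Rows": {"Native", "Blaze", "Spark"},
    "Sample First": {"Native", "Blaze"},
    "Limit N": {"Spark"},
    "Random Percentage": {"Spark"},
}


def is_supported(option, environment):
    """Return True if the table lists the environment for the option."""
    return environment in SUPPORTED_ENVIRONMENTS.get(option, set())


def sample_first(rows, number):
    """Return the first `number` rows, mirroring the Sample First option."""
    if not 1 <= number <= MAX_SAMPLE_ROWS:
        raise ValueError(f"number must be between 1 and {MAX_SAMPLE_ROWS}")
    return list(islice(rows, number))


def random_percentage(rows, percentage, seed=None):
    """Keep each row with probability percentage/100 (illustrative assumption)."""
    if not 0 <= percentage <= 100:
        raise ValueError("percentage must be between 0 and 100")
    rng = random.Random(seed)
    return [row for row in rows if rng.random() * 100 < percentage]


if __name__ == "__main__":
    data = range(1, 1001)
    print(sample_first(data, 5))
    print(len(random_percentage(data, 10, seed=42)))
    print(is_supported("Sample First", "Spark"))
```

A lookup such as `is_supported("Limit N", "Native")` can act as a quick check before you choose a sampling option for a given run-time environment.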
