Data Discovery Guide

Column Profile Concepts Overview
Profiles Overview

A column profile determines the characteristics of columns in a data source, such as value frequency, percentages, and patterns.
Column profiling discovers the following facts about data:
  • The number of null, distinct, and non-distinct values in each column, expressed as a number and a percentage.
  • The patterns of data in each column and the frequencies with which these patterns occur.
  • Statistics about the column values, such as the maximum and minimum lengths of values and the first and last values in each column.
  • Documented data types, inferred data types, and possible conflicts between the documented and inferred data types.
  • Pattern and value frequency outliers.
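The facts listed above can be illustrated with a small, standalone Python sketch. It is not Informatica code and does not call any Informatica API. The function names, the pattern notation (digits become 9, letters become X), and the type-inference heuristic are assumptions made for illustration. It also assumes that "distinct" means the number of different non-null values, that "non-distinct" means the number of those values that occur more than once, and that the first and last values are taken in sort order.

```python
import re
from collections import Counter

INTEGER = re.compile(r"[+-]?\d+")
DECIMAL = re.compile(r"[+-]?\d+(\.\d+)?")


def value_pattern(value):
    """Reduce a value to a character pattern: digits become 9, letters become X."""
    return re.sub(r"[A-Za-z]", "X", re.sub(r"[0-9]", "9", value))


def infer_type(values):
    """Guess a data type from the non-null values (a deliberately crude heuristic)."""
    if not values:
        return "unknown"
    if all(INTEGER.fullmatch(v) for v in values):
        return "integer"
    if all(DECIMAL.fullmatch(v) for v in values):
        return "decimal"
    return "string"


def profile_column(values, documented_type=None):
    """Profile one column given as a list of strings, with None marking a null."""
    total = len(values)
    non_null = [str(v) for v in values if v is not None]
    counts = Counter(non_null)

    def pct(n):
        # Percentages are relative to the total number of rows (an assumption).
        return 100.0 * n / total if total else 0.0

    null_count = total - len(non_null)
    distinct = len(counts)
    # Assumed meaning: different values that occur more than once.
    non_distinct = sum(1 for c in counts.values() if c > 1)
    inferred = infer_type(non_null)

    return {
        "null_count": null_count,
        "null_pct": pct(null_count),
        "distinct_count": distinct,
        "distinct_pct": pct(distinct),
        "non_distinct_count": non_distinct,
        "non_distinct_pct": pct(non_distinct),
        "pattern_frequencies": dict(Counter(value_pattern(v) for v in non_null)),
        "min_length": min((len(v) for v in non_null), default=None),
        "max_length": max((len(v) for v in non_null), default=None),
        "first_value": min(non_null, default=None),  # first in sort order
        "last_value": max(non_null, default=None),   # last in sort order
        "inferred_type": inferred,
        # A conflict is flagged only when a documented type was supplied.
        "type_conflict": documented_type is not None and documented_type != inferred,
    }


if __name__ == "__main__":
    sample = ["10", "20", "20", None, "AB-12"]
    for key, value in profile_column(sample, documented_type="integer").items():
        print(f"{key}: {value}")
```

In this sample, the single value `AB-12` makes the inferred type `string` even though the column is documented as `integer`, which is the kind of conflict a column profile is meant to surface.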
You can configure the following options when you create or edit a profile:
  • Column profile options. You can select the columns on which you want to run a profile, choose a sampling option, and choose a drill-down option.
  • Filters and rules. You can add, edit, or delete filters and rules.
In the profile results, you can add comments and tags to a profile and to the columns in a profile. You can assign business terms to columns.
The Model repository locks profiles so that users cannot overwrite one another's work. The version control system saves multiple versions of a profile and assigns a version number to each version. You can check out a profile and then check the profile in after you make changes. You can also undo a checkout before you check the profile back in.
Create scorecards to periodically review data quality. You can create scorecards before and after you apply rules to profiles so that you can view a graphical representation of the valid values for columns.
Use the Scheduler Service to schedule profiles and scorecards to run at a specific time or at intervals. The Scheduler Service manages schedules for profiles, scorecards, deployed mappings, and deployed workflows. You can create, manage, and run schedules in Informatica Administrator.
You can configure the Data Integration Service to use operating system profiles. After you configure operating system profiles, the Data Integration Service runs the profiles and scorecards with the permissions of the operating system user that you define in the operating system profile. You can select the operating system profile in the Analyst tool and the Developer tool.