When you create or edit a profile in the Analyst tool, you can select the run-time environment.
In the Discovery Home panel, click Data Object Profile, or select New > Data Object Profile from anywhere in the Analyst tool.
The New Profile wizard appears. The Column profiling option is selected by default.
Click Next.
In the Sources pane, select a data object.
Click Next.
Enter a name and an optional description for the profile.
In the Folders pane, select the project or folder where you want to create the profile.
The Analyst tool displays the project that you selected and shared projects that contain folders where you can create the profile. The profiles in the folder appear in the right pane.
Click Next.
In the Columns pane, select the columns that you want to run a profile on. The columns include any rules that you applied to the profile. The Analyst tool lists column properties, such as the name, data type, precision, and scale for each column.
Optionally, select Name to select all columns.
In the Sampling Options pane, configure the sampling options.
In the Drilldown Options pane, configure the drill-down options.
Optionally, click Select Columns to select columns to drill down on. In the Drilldown columns dialog box, select the columns for drill-down and click OK.
Accept the default option in the Profile Results Option pane.
The first time you run the profile, the Analyst tool displays profile results for all columns selected for profiling.
Click Next.
Optionally, define a filter for the profile.
Click Next to verify the row drill-down settings, including the preview columns for drill-down.
To run the profile in the Hadoop environment, select Hive and then select a Hive connection. The Data Integration Service uses the Hive connection to communicate with the Hadoop cluster and push the profile execution down to the cluster.
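
As a quick way to review the wizard choices before you run the profile, the settings can be written down as a small record and checked for consistency. The following Python sketch is illustrative only: the ProfileSettings class, its field names, and the sample values are hypothetical and are not part of the Analyst tool or any Informatica API. It encodes only two rules stated in the steps above: a profile needs a name, and selecting Hive requires a Hive connection.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ProfileSettings:
    """Hypothetical record of the choices made in the New Profile wizard."""
    name: str                      # required in the wizard
    data_object: str               # chosen in the Sources pane
    folder: str                    # chosen in the Folders pane
    columns: List[str]             # chosen in the Columns pane
    description: str = ""          # optional in the wizard
    drilldown_columns: List[str] = field(default_factory=list)
    row_filter: Optional[str] = None
    run_on_hive: bool = False      # True when Hive is selected
    hive_connection: Optional[str] = None

    def problems(self) -> List[str]:
        """Return human-readable problems; an empty list means no issue found."""
        issues = []
        if not self.name.strip():
            issues.append("a profile name is required")
        if self.run_on_hive and not self.hive_connection:
            issues.append("a Hive connection is required when Hive is selected")
        return issues


# Example values are made up for illustration.
settings = ProfileSettings(
    name="customer_profile",
    data_object="customers",
    folder="SalesProject",
    columns=["id", "email"],
    run_on_hive=True,
)
for issue in settings.problems():
    print(issue)
```

In this sketch, a record with Hive selected but no connection reports one problem, which mirrors the requirement to select a Hive connection after you select Hive.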