Table of Contents

Search

  1. Preface
  2. Data Profiling
  3. Profiles
  4. Profile results
  5. Tuning data profiling task performance
  6. Troubleshooting

Data Profiling

Data Profiling

Queries

Queries

After you run a profile on a source object, you can create and run one or more queries on the source object.
You can create and run queries if you select the
Drilldown
option for the current profile run. You need the Query - Create privilege to create queries, and the Query - Submit privilege to run queries and view query results.

Create a query

On the
Results
tab, you can create one or more queries to retrieve the rows from a profiled source object that has a data quality problem. You can query based on field or column values, inferred patterns, data types, and rule outputs. For example, you can create and run a query that can retrieve source rows that are 'Invalid', where 'Invalid' is a business rule that you define in a rule specification, or if the postal code pattern is not 9(5).
You can add one or more query conditions to a query. The following table shows the attributes that you use to create a condition:
Attribute
Description
Columns
Choose a column. You can select columns in the source object and rule outputs in the current profile run.
Columns and rule outputs might not appear in the list of columns if the data type of a column and rule output is not supported by
Data Profiling
.
Operator
Choose an operator to filter the results.
You can select Equals, Not Equals, Less Than, Less Than or Equals, Greater Than, Greater Than or Equals, Between, In, Not In, Is Null, Is Not Null, Patterns, Data Types, Starts With, Ends With, or Contains operator for a condition.
When you select the Patterns operator,
Data Profiling
shows the inferred patterns for the current profile run. When you select Data Types operator,
Data Profiling
shows the documented data type and inferred data types in the current profile run.
Data Profiling
does not show any inferred pattern if you select a column that is not included in the latest profile run. In this case, you can enter a pattern.
Values
Enter the values as necessary.
When you choose the Patterns or Data Types operator, you can select one or more patterns or data types as values.

Run a query

You can run more than one query at a time. To run the queries, choose a flat file connection.
Data Profiling
runs the queries on the runtime environment that you chose for the flat file connection. When you use a flat file connection to create and run a profile on a flat file source,
Data Profiling
shows the flat file connections that use the same runtime environment that was used in the profile's flat file connection. You can create a dedicated flat file connection to run and save queries.
Data Profiling
creates a job when you run a query. You can monitor the job progress on the
My Jobs
page. You can also monitor the job progress in
Monitor
and
Operational Insights
.
The query runs on all the rows in the source object. If you chose a filter for the profile run or choose a filter and then create a query,
Data Profiling
filters the source object and then runs the query on the filtered results.

View query results

You can view the query results in the
Data Preview
area. When you run the query,
Data Profiling
generates a query results file named
query_<ProfileName>query<QueryName>.csv
. If the profile has associated rules,
Data Profiling
also generates a legend file named
query_<ProfileName>query<QueryName>.legend
which explains the column content in the query results file.
Data Profiling
saves the files in the directory that you specified in the flat file connection.
Data Profiling
. When you run a query multiple times, the query results are overwritten in the file.

Delete a query

When you delete a query,
Data Profiling
deletes the query from the profile. It does not delete the query results file and legend file related to the query. You can maintain, secure, and delete the files as required.

Example

You are a data analyst. You run a profile on the Order table, and you notice that the OrderID column has data types and patterns that are not valid. You want to generate a query to extract these specific results to analyze them. To accomplish this task, you complete the following steps:
  1. On the
    Results
    page, you create a query to meet one of the following conditions:
    1. You choose the Patterns operator for the OrderID column and then select the inferred patterns that are invalid.
    2. You choose the Data Types operator for the OrderID column and then select the inferred data types that are invalid.
  2. You save and run the query.
    The complete query results appear in the
    Data Preview
    area.
  3. Alternatively, to view the complete query results, you navigate to the query results file location to analyze the results.

0 COMMENTS

We’d like to hear from you!