Table of Contents

Search

  1. Preface
  2. Data Profiling
  3. Profiles
  4. Profile results
  5. Tuning data profiling task performance
  6. Troubleshooting

Data Profiling

Data Profiling

Filters

Filters

You can use filters to select the values that a profile can read in a column of source data. You can create filters based on the simple and query filter types.
When you add a filter to a data column, the profile runs only on the data values that meet the filter criteria that you specify. You can add, delete, or update the filters in subsequent runs. After you add the filters, you can choose the filter that you want for the next profile run.
When you delete a column from the source object, any filter on the column is deleted from the profile during the profile run. When a filter applies to more than one column and you delete one of the columns,
Data Profiling
ignores the filter or filter condition that uses the deleted column during the profile run.
You can create the following types of filter:
Simple filter
When you create a simple conditional filter, you can select operators such as Equals, Less Than, Less Than or Equals, Greater Than, Greater Than or Equals, Not Equals, Is Null, and Is Not Null.
For example, you are a data analyst and you created a profile on a Sales table. You want to extract the sales details for New York and share it with the business team. To accomplish this task, you create a filter with the filter condition
City = New York
and add it to the profile. You run the profile and export the profile results to share with the business team.
You can also create dynamic filters for relational data sources to filter the date and timestamp columns. The dynamic filter includes options such as Today, Tomorrow, Yesterday, Next Week, Next Month, and Custom.
For example, assume that you want to profile the sales orders that were created last month, and run the profile every month. To accomplish this task, you create a filter with the dynamic filter condition
COLUMN_DATE = Last Month
and add it to the profile. By doing this, you need not change the filter condition every month and
Data Profiling
resolves the right date at the runtime when the profiling task runs.
The following image displays a sample of the simple dynamic filter: The image displays an example of the simple dynamic filter.
Query filter
You can define a custom SQL query to apply a complex filter condition to the column data. You can create an SQL filter for relational data sources such as Oracle, Amazon Redshift, and Snowflake. You must enter the SQL query with just the WHERE clause, but not the entire query statement.
You can enter the SQL query starting with the query condition as shown in the following example,
Id IN (SELECT Id FROM TABLE_2 WHERE Id > '35') AND City='Chicago'
.
Test the SQL statement you want to use as a filter condition before you create a saved query.
Data Profiling
does not display specific error messages for invalid SQL statements.
To filter Google BigQuery source objects, use the SQL Override Query in the advanced options.

0 COMMENTS

We’d like to hear from you!