In basic mode, a rule statement is a set of operators, conditions, and actions that analyze a column of data and generate an output based on the result of the analysis. You add a rule statement to a rule set.
A condition is a data operation that determines a single fact about a data value. You can add multiple conditions to a rule statement. An action is a data operation that generates a potential output from the rule set. An action generates data when the input that you add to the rule statement satisfies the conditions that you define.
A rule specification reads the rule statements in a rule set from top to bottom. For a given row of input data, the rule set accepts the output from the first rule statement that generates output data.
Each rule set contains a system-defined rule statement that specifies the action to take if no other rule statement generates output data. The system-defined rule statement is the final rule statement in the rule set. You can edit the action in the system-defined rule statement. By default, the rule statement specifies that the rule set does not generate any output data if the other rule statements do not generate output data.
In advanced mode, you can write the equivalent rule in expression logic.
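The evaluation order described above can be sketched in Python. This is an illustration of the top-to-bottom, first-output-wins behavior only, not the product's expression syntax or API. The names `RuleStatement`, `evaluate_rule_set`, and `no_output` are hypothetical, and the sketch assumes that all conditions in a rule statement must be satisfied (an AND combination) before the action runs.

```python
from typing import Callable, NamedTuple, Optional, Sequence


class RuleStatement(NamedTuple):
    """Hypothetical model of a rule statement: conditions plus one action."""
    conditions: Sequence[Callable[[str], bool]]
    action: Callable[[str], str]


def no_output(value: str) -> Optional[str]:
    """Default system-defined action: generate no output data."""
    return None


def evaluate_rule_set(
    statements: Sequence[RuleStatement],
    value: str,
    system_action: Callable[[str], Optional[str]] = no_output,
) -> Optional[str]:
    """Read the statements top to bottom and accept the first output."""
    for statement in statements:
        # Assumption: every condition must hold for the action to run.
        if all(condition(value) for condition in statement.conditions):
            return statement.action(value)
    # No statement generated output, so the final system-defined
    # rule statement decides the result.
    return system_action(value)


# Example rule set for an "age" column.
age_rules = [
    RuleStatement([lambda v: v.strip() == ""], lambda v: "Invalid"),
    RuleStatement(
        [lambda v: v.isdigit(), lambda v: 0 <= int(v) <= 120],
        lambda v: "Valid",
    ),
]

empty_result = evaluate_rule_set(age_rules, "")
numeric_result = evaluate_rule_set(age_rules, "42")
unmatched_result = evaluate_rule_set(age_rules, "abc")
edited_default = evaluate_rule_set(age_rules, "abc", lambda v: "Invalid")
```

In the sketch, the value `abc` satisfies neither rule statement, so the result depends entirely on the system-defined action. Passing a different `system_action` corresponds to editing the action in the system-defined rule statement.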
Status values in rule statements
The actions that a rule statement can perform include the generation of status values. A status value is a predefined value that a rule statement can generate as the output from an action. You can configure an action to generate a status value as an alternative to a user-defined value.
A rule statement can return Valid or Invalid as status values. You cannot modify the status values.
You can use status values to achieve the following objectives:
Provide data to profiles
Scorecards in Data Profiling can recognize status values. When a rule specification returns a status value as an output, the scorecard can report the number of status values as a data quality category. To enable the scorecard to read the rule specification output, add the rule specification as a rule to the profile that generates the scorecard.
Provide information to downstream users about exception records
You can configure a rule specification to identify a record as an exception. An exception is a record that contains unresolved data quality issues.
To identify records as exceptions in basic mode, configure a rule statement to return the status value Invalid. Define the exception properties on a rule statement in the primary rule set.
To identify records as exceptions in advanced mode, configure exception indicators for one or more status values that you define in the rule logic that you write. In advanced mode, you can define custom status values that more closely describe the data quality issue in each exception record. You can associate each status value with a corresponding data quality issue. You can configure a different set of exception properties for each status value.
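The two objectives above can be illustrated with a short Python sketch. It is a conceptual model only and does not represent the product's rule logic syntax. The status values `Missing` and `Malformed`, the `EXCEPTION_INDICATORS` mapping, and the functions `email_status` and `summarize` are all hypothetical. The sketch counts each status value, as a scorecard might report them, and collects the records whose status is flagged as an exception.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical custom status values, each with an exception indicator.
EXCEPTION_INDICATORS: Dict[str, bool] = {
    "Valid": False,
    "Missing": True,
    "Malformed": True,
}


def email_status(value: str) -> str:
    """Return a status value that describes the data quality issue."""
    if not value.strip():
        return "Missing"
    if value.count("@") != 1:
        return "Malformed"
    return "Valid"


def summarize(values: List[str]) -> Tuple[Counter, List[Tuple[str, str]]]:
    """Count status values and collect the exception records."""
    statuses = [email_status(v) for v in values]
    counts = Counter(statuses)
    exceptions = [
        (value, status)
        for value, status in zip(values, statuses)
        if EXCEPTION_INDICATORS[status]
    ]
    return counts, exceptions


counts, exceptions = summarize(
    ["a@example.com", "", "bad-address", "b@example.com"]
)
```

Keeping the exception indicator in a separate mapping mirrors the idea that each status value can carry its own exception configuration. In basic mode, the equivalent mapping would contain only Valid and Invalid, with Invalid marked as the exception.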