Table of Contents

Search

  1. Preface
  2. Introduction to Data Validation Option
  3. New Features and Behavior Changes
  4. Repositories
  5. XML Data Source
  6. Tests for XML Data Sources
  7. Connections
  8. Expressions
  9. Table Pairs
  10. Tests for Table Pairs
  11. Single-Table Constraints
  12. Tests for Single-Table Constraints
  13. Examples of Tests from Spreadsheets
  14. SQL Views
  15. Lookup Views
  16. Join Views
  17. Aggregate Views
  18. Business Intelligence and Reporting Tools Reports
  19. Dashboards
  20. DVOCmd Command Line Program
  21. Troubleshooting
  22. Datatype Reference
  23. Reporting Views
  24. Metadata Import Syntax
  25. Jasper Reports
  26. Glossary

Data Validation Option User Guide

Data Validation Option User Guide

Data Validation Tests

Data Validation Tests

After you create a table object, you can create one or more tests to validate the data. You can configure test functions and conditions when you create a test. The type of test that you can create depends on the table object that you use. For example, you can test for null values on a single table object, but not on a table pair object.
When you create a test, you choose the type of test that you want to perform and the type of operation, such as equals or does not equal. If you want to set a threshold for a margin of error, you can choose the approximate operator. You can also configure functionality such as case sensitivity or an allowance for the maximum number of bad records. You can create multiple tests on a table object. When you run a test on a table object, you run all tests that you created on that object.
You can create the following types of data validation tests:
Aggregate tests
Retrieves summarized information about data contained in the data source. Use an aggregate test to verify whether all records were moved or to identify incorrect logic in WHERE clauses.
An aggregate test can detect the following problems:
  • Lack of referential integrity in the source
  • Rejected rows by the target system
  • Incorrect logic in the mapping WHERE clauses
The following table describes the aggregate test functions:
Aggregate Test Functions
Description
Count
Counts non-null values.
Count Distinct
Counts distinct non-null values.
Count Rows
Counts all values including nulls.
Min
Calculates the minimum value for a field.
Max
Calculates the maximum value for a field.
Average
Calculates the average of a numeric field.
Sum
Calculates the total of a numeric field.
Perform aggregate functions in the following order:
  1. COUNT and COUNT ROWS to count the number of rows
  2. SUM to sum numeric fields
  3. COUNT DISTINCT to compare detail and aggregate tables
For example, the target data for daily sales contains shows only total sales of $15,000. You know that your company sells at least 1,000 items a day and that the total sales should be between $100,000 and $200,000. You configure the following test functions:
COUNT_ROWS (any fld) > 1000 SUM (Amount) Between 100000, 200000
Set tests
Compares the distinct values of one field to the distinct values of another field. Determines if the set of values is equal in both data sources or if values are missing in one data source. You might use a set test to identify records in a fact table that are not in the dimension table. You can check target tables to find a lack of referential integrity in the target, either a child without a parent, or a fact record without a corresponding dimension table.
For example, you configure the fact table to be Table A and you configure the dimension table to be Table B. You configure the test
Set A in B
to perform the validation test. This test will verify that the fact foreign keys are in the parent table. You create an expression that concatenates composite keys, and you run the tests on that expression.
The following table describes the set test functions:
Set Test Functions
Description
Set A in B
Determines whether all distinct values of A are in B.
Set B in A
Determines whether all distinct values of B are in A.
Set A Equal B
Determines whether A and B have the same distinct values.
Set A Not In B
Determines whether A and B share no common values.
Value tests
Evaluates data based on values. Errors in complex transformations can manifest themselves in simple ways such as NULLs in the target, missing rows, or incorrect formats. To test for these types of errors, you can enforce constraints on target tables.
You might use an outer value test type to check the lack of referential integrity in the source and identify orphan records.
The following table describes the value and expression test functions:
Value Test Functions
Description
Value
Tests individual values across data sets. For example,
Value (FldA) Between 10,50
.
Outer Value
Compares values across tables row-by-row to identify orphans across data sets.
Not Null
Determines whether values in fields are not null. For example,
NOT_NULL (FldD)
.
Not Blank
Determines whether values in fields are not blank.
Unique
Determines whether values in fields are unique. For example,
UNIQUE (PK)
.
Format
Determines whether the values in the field match a pattern or constraint value.