Database Ingestion and Replication


Running data validation for database ingestion and replication jobs


For initial load jobs that completed successfully, you can run data validation to compare the source and target data. Data validation is available only for initial load jobs that have an Oracle or a SQL Server source and a Snowflake target.
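The eligibility conditions above can be expressed as a small pre-check. The following is a minimal sketch only: the function name and the string labels (such as "initial_load") are hypothetical and are not identifiers used by the product.

```python
# Hypothetical pre-check that mirrors the documented restrictions.
# The labels below are illustrative assumptions, not product identifiers.
SUPPORTED_SOURCES = {"oracle", "sql server"}
SUPPORTED_TARGETS = {"snowflake"}


def can_run_data_validation(load_type, source, target, status):
    """Return True if a job meets the documented conditions for data validation."""
    return (
        load_type.lower() == "initial_load"       # only initial load jobs
        and source.lower() in SUPPORTED_SOURCES   # Oracle or SQL Server source
        and target.lower() in SUPPORTED_TARGETS   # Snowflake target
        and status.lower() == "completed"         # job completed successfully
    )
```

A check like this could be run over a list of jobs before opening each one in the user interface. It does not account for the organization-level feature flag.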

The availability of the data validation feature is controlled by an organization-level feature flag. If this functionality is not available for your organization but you want to use it, contact Informatica Global Customer Support.

When you run data validation for Database Ingestion and Replication, you are charged based on the CPU consumption on the Data Validation service side.

The source and target connections defined in the task for which you want to run data validation must be on the same Secure Agent. You must enable the Data Validation service on the Secure Agent.

The source and target schemas specified in the task definition must be the same as the schemas used in the source and target connection properties.

In the Snowflake Data Cloud connection properties, enter the database and schema names in the Additional JDBC URL Parameters field in the following format:

db=<database_name>&schema=<schema_name>
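As an illustration, the value for this field can be assembled programmatically. This is a minimal sketch: the function name and the sample database and schema names are hypothetical, and the sketch assumes that the names need no URL encoding.

```python
def build_additional_jdbc_params(database_name, schema_name):
    """Build the field value in the documented db=...&schema=... format.

    Assumption: the names are plain identifiers that need no URL encoding.
    """
    if not database_name or not schema_name:
        raise ValueError("Both a database name and a schema name are required.")
    return f"db={database_name}&schema={schema_name}"


# SALES_DB and PUBLIC are made-up example names.
print(build_additional_jdbc_params("SALES_DB", "PUBLIC"))
```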

The source table and column names cannot contain special characters. Otherwise, data validation fails.
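Table and column names can be screened before data validation is run. This documentation does not list which characters count as special, so the following sketch assumes that anything other than ASCII letters, digits, and underscores is special. Treat that rule and the function name as hypothetical.

```python
import re

# Assumption: only ASCII letters, digits, and underscores are "safe".
# The documentation does not define the exact set of special characters.
_SAFE_NAME = re.compile(r"[A-Za-z0-9_]+")


def find_names_with_special_chars(names):
    """Return the names that contain characters outside the assumed safe set."""
    return [name for name in names if not _SAFE_NAME.fullmatch(name)]
```

For example, passing the made-up names `["ORDERS", "ORDER_ID", "ORDER-DATE", "TOTAL$"]` would flag the last two.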

To prevent false alarms that result from validating unsupported data types, you can exclude these data types by using the datavalidation.datatypes.skip custom property. On the Schedule and Runtime Options page of the task wizard, enter datavalidation.datatypes.skip as the property name and a comma-separated list of data types as the property value.
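The property name and value pair can be prepared as follows. This is a sketch only: the helper name is hypothetical, the data type names in the usage note are examples rather than a list of unsupported types, and the sketch assumes that the value should contain no spaces around the commas.

```python
PROPERTY_NAME = "datavalidation.datatypes.skip"


def build_skip_property(data_types):
    """Return (property name, comma-separated value) for the custom property.

    Assumption: the value has no spaces around the commas. Duplicates and
    blank entries are dropped, and the original order is kept.
    """
    cleaned = [t.strip() for t in data_types if t.strip()]
    unique = list(dict.fromkeys(cleaned))
    return PROPERTY_NAME, ",".join(unique)
```

Calling `build_skip_property(["BLOB", "CLOB"])` yields the name `datavalidation.datatypes.skip` and the value `BLOB,CLOB`, which can then be entered in the task wizard.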

To display the job details, drill down on a job from the My Jobs page in the Data Integration service, the All Jobs page in the Monitor service, or the Data Ingestion and Replication page in the Operational Insights service.

On the Object Detail pane, navigate to the subtask row for which you want to run data validation. In the Actions menu for the row, select Run Data Validation.

For the Run Data Validation option to be available, the task must have the status of Completed.

Configure how the data should be validated:

Select the Flat file connection.

This connection will be used to store the data validation results.

The Flat file connection and the database ingestion and replication job must be in the same runtime environment.

In the Sample field, select the sampling option that determines how much data is compared. The default value is Last 1000 Rows.

Click Run.

The data validation process starts. The Data Validation column in the Object Detail pane shows the data validation status for the selected task.

If data validation processing completes successfully, you can click the Success status to view the Data Validation Summary. The summary contains the results of the row count validation and the cell-to-cell comparison.

To download a detailed data validation report, click the Download icon. The report highlights any missing or modified rows and columns based on a comparison of the source and target tables.
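To make the two kinds of checks concrete, the following sketch shows what a row count validation and a cell-to-cell comparison might look like on small in-memory tables. It is not the Data Validation service's implementation. The function name, the result layout, and the use of a single key column to match rows are all assumptions made for illustration.

```python
def compare_tables(source_rows, target_rows, key):
    """Compare two tables given as lists of dicts.

    Assumption: `key` names a column that uniquely identifies each row.
    """
    source_by_key = {row[key]: row for row in source_rows}
    target_by_key = {row[key]: row for row in target_rows}

    # Rows present in the source but absent from the target.
    missing = sorted(k for k in source_by_key if k not in target_by_key)

    # Cell-to-cell comparison for rows that exist on both sides.
    modified = {}
    for k, source_row in source_by_key.items():
        target_row = target_by_key.get(k)
        if target_row is None:
            continue
        changed = sorted(
            col for col in source_row if source_row[col] != target_row.get(col)
        )
        if changed:
            modified[k] = changed

    return {
        "row_count_match": len(source_rows) == len(target_rows),
        "missing_in_target": missing,
        "modified_cells": modified,
    }
```

Given a source table with three rows and a target table that lacks one of them and differs in one cell, the result would report a row count mismatch, the key of the missing row, and the changed column for the modified row.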

If an error occurred during data validation processing, click the Download icon next to the Error status to view the error message.
