Database Ingestion and Replication

Running data validation for database ingestion and replication jobs
For initial load jobs that completed successfully, you can run data validation to compare the source and target data. Data validation is available only for initial load jobs that have an Oracle or a SQL Server source and a Snowflake target.
  • The availability of the data validation feature is controlled by an organization-level feature flag. If this functionality is not available for your organization but you want to use it, contact Informatica Global Customer Support.
  • When you run data validation for Database Ingestion and Replication, you are charged based on CPU consumption on the Data Validation service side.
  • The source and target connections defined in the task for which you want to run data validation must be on the same Secure Agent. You must enable the Data Validation service on the Secure Agent.
  • The source and target schemas specified in the task definition must be the same as the schemas used in the source and target connection properties.
  • In the Snowflake Data Cloud connection properties, enter the database and schema name in the Additional JDBC URL Parameters field in the following format: db=<database_name>&schema=<schema_name>
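    For example, assuming a hypothetical Snowflake database named MYDB and a schema named SALES, the field value would be:
    db=MYDB&schema=SALES
    The database and schema names here are placeholders only; use the names from your own Snowflake account.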
  • The source table and column names must not contain special characters. Otherwise, data validation fails.
  • To prevent false alarms that result from validating unsupported data types, you can exclude these data types by using the datavalidation.datatypes.skip custom property. On the Schedule and Runtime Options page of the task wizard, enter datavalidation.datatypes.skip as the property name and a comma-separated list of data types as the property value.
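    For example, assuming that the hypothetical source data types XMLTYPE and BFILE cause false alarms in your environment, you could enter:
    Property name: datavalidation.datatypes.skip
    Property value: XMLTYPE,BFILE
    The data type names shown are examples only; list the data types that actually apply to your source.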
  1. To display the job details, drill down on a job from the My Jobs page in the Data Integration service, the All Jobs page in the Monitor service, or the Data Ingestion and Replication page in the Operational Insights service.
  2. In the Object Detail pane, navigate to the subtask row for which you want to run data validation. In the Actions menu for the row, select Run Data Validation.
    For the Run Data Validation option to be available, the task must have a status of Completed.
  3. Configure how the data should be validated:
    1. Select the Flat file connection.
      This connection is used to store the data validation results. The Flat file connection and the database ingestion and replication job must be on the same runtime environment.
    2. In the Sample field, select the sample size of the data to compare. The default value is Last 1000 Rows.
  4. Click Run.
    The data validation process starts. The Data Validation column in the Object Detail pane shows the data validation status for the selected task.
    If data validation completes successfully, you can click the Success status to view the Data Validation Summary. The summary contains the results of the row count validation and the cell-to-cell comparison.
    To download a detailed data validation report, click the Download icon. The report highlights any missing or modified rows and columns based on a comparison of the source and target tables.
    If an error occurred during data validation processing, click the Download icon next to the Error status to view the error message.
