Tasks

Guidelines for customizing your source

To increase performance by removing unnecessary data, by default Data Integration adds only the source objects you include in the source location.
Use the following guidelines when customizing your source:
Remove unnecessary data from the data flow.
When you create a task, it's best to remove unnecessary source objects, fields, and records from the data flow. Removing unnecessary data decreases the time it takes to run a task and helps to minimize rejected records.
You can choose which source objects to read on the Connect Source page. You can also select the fields to exclude and configure filters to exclude unnecessary records.
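As a rough illustration of what trimming the data flow means, the following Python sketch drops excluded fields and filters out unwanted records before anything is loaded. The record layout, `trim_records`, `excluded_fields`, and the `keep` predicate are hypothetical names that only mimic the effect of the field and filter options. They are not part of Data Integration.

```python
from typing import Callable, Iterable

Record = dict[str, object]


def trim_records(
    records: Iterable[Record],
    excluded_fields: set[str],
    keep: Callable[[Record], bool],
) -> list[Record]:
    """Drop records that fail the filter, then drop excluded fields (illustrative only)."""
    trimmed = []
    for record in records:
        if not keep(record):
            continue  # filtered-out records never enter the data flow
        trimmed.append({k: v for k, v in record.items() if k not in excluded_fields})
    return trimmed


# Hypothetical source object with one bulky field and one unwanted record.
source = [
    {"id": 1, "status": "active", "notes": "long free text"},
    {"id": 2, "status": "closed", "notes": "more free text"},
]
result = trim_records(source, {"notes"}, lambda r: r["status"] == "active")
print(result)  # [{'id': 1, 'status': 'active'}]
```

Fewer fields and fewer records mean less data to move and fewer rows that the target can reject.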
To prevent duplicate target rows, configure primary key fields.
Primary key fields uniquely identify records in the source and target objects. When you re-run a task, the task uses the primary key fields so that it can update existing rows and insert new rows into the target tables. If the source objects don't have primary key fields defined, the task inserts rows into the target tables, but it cannot update existing rows, which can lead to duplicate rows in the target tables.
Data loader tasks can automatically detect primary key fields for most connection types. You can also select the primary key fields manually. Configure primary key field options on the Connect Source page.
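The effect of primary key fields on a re-run can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not how the task is implemented: the `load` function, the in-memory `target` list, and the `id` field are all hypothetical.

```python
Row = dict[str, object]


def load(target: list[Row], source: list[Row], primary_key: list[str]) -> list[Row]:
    """Return the target contents after one run (illustrative model only)."""
    if not primary_key:
        # No primary key: rows cannot be matched, so every source row is inserted.
        return target + [dict(row) for row in source]

    def key_of(row: Row) -> tuple:
        return tuple(row[field] for field in primary_key)

    merged = {key_of(row): dict(row) for row in target}
    for row in source:
        merged[key_of(row)] = dict(row)  # update an existing row or insert a new one
    return list(merged.values())


source = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Grace"}]

# Run the same load twice, once with a primary key and once without.
with_key = load(load([], source, ["id"]), source, ["id"])
without_key = load(load([], source, []), source, [])
print(len(with_key), len(without_key))  # 2 4
```

With the key, the second run updates the two existing rows. Without it, the second run inserts them again, which produces the duplicates described above.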
To process only new and changed data, configure watermark fields.
Watermark fields are date/time or numeric fields that identify which records were added or changed. If the source objects don't have watermark fields defined, the task must process all records in the source objects each time the task runs, which increases the task processing time.
Data loader tasks can automatically detect watermark fields for most connection types. You can also select the watermark fields manually. Configure watermark field options on the Connect Source page.
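A minimal Python sketch of watermark-based reads follows. It assumes the task remembers the highest watermark value it has seen and, on the next run, reads only records with a strictly greater value. The `read_changed` function, the `updated_at` field, and the strict comparison are illustrative assumptions, not documented Data Integration behavior.

```python
from datetime import datetime


def read_changed(source, watermark_field, last_watermark):
    """Return (records to process, new watermark). Illustrative model only."""
    if watermark_field is None:
        # No watermark field: every record must be processed on every run.
        return list(source), last_watermark
    if last_watermark is None:
        changed = list(source)  # first run: nothing has been processed yet
    else:
        changed = [r for r in source if r[watermark_field] > last_watermark]
    new_watermark = max((r[watermark_field] for r in changed), default=last_watermark)
    return changed, new_watermark


source = [
    {"id": 1, "updated_at": datetime(2024, 1, 1)},
    {"id": 2, "updated_at": datetime(2024, 1, 5)},
]
first_run, mark = read_changed(source, "updated_at", None)

# A new record arrives between runs.
source.append({"id": 3, "updated_at": datetime(2024, 1, 9)})
second_run, mark = read_changed(source, "updated_at", mark)
print(len(first_run), [r["id"] for r in second_run])  # 2 [3]
```

The same comparison works for numeric watermark fields, such as an increasing sequence number.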
The following table lists the possible primary key and watermark field configurations and the expected results:

| Primary key and watermark field configuration | Result |
| --- | --- |
| Primary key fields configured. Watermark fields configured. | Changed records updated, new records inserted into the target tables (upsert). Recommended configuration for best performance. |
| Primary key fields configured. Watermark fields not required. | Changed records updated, new records inserted into the target tables (upsert). Impacts task performance because the task must perform a full scan on the source. |
| Primary key fields not required. Watermark fields configured. | All records inserted into the target tables. Records that already exist are duplicated. |
| Primary key fields not required. Watermark fields not required. | All records inserted into the target tables. Records that already exist are duplicated. Impacts task performance because the task must perform a full scan on the source. This is the least recommended configuration. |
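The four configurations reduce to two independent choices: primary key fields decide how rows are written, and watermark fields decide how much of the source is read. The following Python sketch encodes that reading of the table. The `expected_result` function and its labels are hypothetical and exist only to summarize the combinations.

```python
def expected_result(primary_key_configured: bool, watermark_configured: bool) -> dict[str, str]:
    """Summarize the expected behavior for one configuration (illustrative only)."""
    return {
        # Primary key fields let the task match rows, which enables upsert.
        "write": "upsert" if primary_key_configured else "insert only, duplicates possible",
        # Watermark fields let the task skip records it has already processed.
        "read": "new and changed records" if watermark_configured else "full scan of the source",
    }


for pk in (True, False):
    for wm in (True, False):
        print(f"primary key={pk}, watermark={wm}: {expected_result(pk, wm)}")
```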
