Implementing Oracle to Snowflake Synchronization Using Cloud Mass Ingestion Databases

Step 2: Configuring the Oracle source

Configure the Oracle source on the Source page of the database ingestion task wizard.
  1. In the Connection list, select the connection for the Oracle source that you configured and tested in Administrator for a runtime environment that your organization uses. The connection type appears in parentheses after the connection name.
    The list includes only the connection types that are valid for the load type selected on the Definition page. No connections are listed if you did not select a load type.
    After you deploy the ingestion task, you cannot change the connection without first undeploying the associated ingestion job. After you change the connection, you must deploy the task again.
  2. In the Schema list, select the source schema that includes the source tables.
    When you create a task that has an Oracle source, the schema name that is specified in the connection properties is displayed by default.
  3. Under Change Data Capture Method, set the CDC Method field to Log-based CDC.
    With this CDC method, inserts, updates, deletes, and column DDL changes are captured from the Oracle redo logs.
  4. Under Table Selection, select Select All to select all tables and columns in the schema for data replication. The Tables Selected field shows the count of all selected tables. You cannot edit the selection later under Table View.
    Fetching information for all of the tables might take a long time.
    Alternatively, you can use rules to define a subset of source tables to replicate.
  5. To use rules to select the source tables, make sure that the Select All check box is cleared and then add rules.
    When rule-based selection is used, you can refine the set of selected tables by table under Table View and also set an option for trimming spaces in character data.
    The default "Include *" rule selects all tables in the specified schema. To see how many tables are selected by this rule, click the Refresh icon to display the table count in Total Tables Selected, and click Apply Rules to see the table count in Table View.
    To add a rule:
    1. Under Rules, click the Add Rule (+) icon above the first table. A row is added to the table.
    2. In the Table Rule column, select Include or Exclude to create an inclusion or exclusion rule, respectively.
    3. In the Condition column, enter a table name or a table-name mask that includes one or more wildcards to identify the source tables to include in or exclude from table selection. Use the following guidelines:
      • A mask can contain one or both of the following wildcards: an asterisk (*) to represent one or more characters and a question mark (?) to represent a single character. A wildcard can occur multiple times in a mask value and can occur anywhere in the value.
      • The task wizard is case sensitive. Enter the table names or masks in the case with which the tables were defined.
      • Do not include delimiters such as quotation marks or brackets, even if the source database uses them. For example, Oracle requires quotation marks around lowercase and mixed-case names to force the names to be stored in lowercase or mixed case. However, in the task wizard, you must enter the lowercase or mixed-case names without quotation marks.
      • If a table name includes special characters such as a backslash (\), asterisk (*), dollar sign ($), caret (^), or question mark (?), escape each special character with a backslash (\) when you enter the rule.
    4. Define additional rules as needed.
      Rules are processed in the order in which they are listed, from top to bottom. Use the arrow icons to change the order.
    5. When finished, click Apply Rules.
      The Total Tables Selected and Tables Affected counts are updated.
      The following image shows multiple rules defined on the Source page:
      After you apply rules, if you add, delete, or change rules, you must click Apply Rules again. Click the Refresh icon to update the table counts. If you delete all rules without clicking Apply Rules, a validation error occurs at deployment, even if the Table View list still lists tables. If you switch to Select All, the rules are no longer in effect and are removed.
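The rule semantics described above can be sketched in code. The following Python model is illustrative only, and the function names `mask_to_regex` and `select_tables` are hypothetical, not part of the product: an asterisk matches one or more characters, a question mark matches exactly one character, a backslash escapes a literal special character, matching is case sensitive, and Include/Exclude rules are applied from top to bottom.

```python
import re

def mask_to_regex(mask: str) -> "re.Pattern[str]":
    """Translate a table-selection mask (* = one or more characters,
    ? = exactly one character) into an anchored regular expression.
    A backslash-escaped character in the mask is matched literally."""
    parts = []
    i = 0
    while i < len(mask):
        c = mask[i]
        if c == "\\" and i + 1 < len(mask):
            parts.append(re.escape(mask[i + 1]))  # escaped literal, e.g. \$
            i += 2
            continue
        if c == "*":
            parts.append(".+")      # * represents one or more characters
        elif c == "?":
            parts.append(".")       # ? represents a single character
        else:
            parts.append(re.escape(c))
        i += 1
    # Anchored and case sensitive, like the task wizard.
    return re.compile("^" + "".join(parts) + "$")

def select_tables(tables, rules):
    """Apply (action, mask) rules top to bottom; later rules win."""
    selected = set()
    for action, mask in rules:
        pattern = mask_to_regex(mask)
        for table in tables:
            if pattern.match(table):
                if action == "Include":
                    selected.add(table)
                else:                 # "Exclude"
                    selected.discard(table)
    return sorted(selected)
```

For example, under these assumed semantics, the rules `Include *` followed by `Exclude AUDIT*` would select every table except those starting with AUDIT.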
  6. To perform trim actions on character columns in the source tables selected based on rules, create column action rules.
    To create a column action rule:
    1. Click the Add Rule (+) icon above the second table.
    2. In the Action column, select one of the following options:
      • LTRIM. Trims spaces to the left of character column values.
      • RTRIM. Trims spaces to the right of character column values.
      • TRIM. Trims spaces to the left of and to the right of character column values.
    3. In the Condition column, enter a column name or a column-name mask that includes one or more asterisk (*) or question mark (?) wildcards. The value is matched against columns in the selected source tables to identify the columns to which the action applies.
    You can define multiple rules for different action types or for the same action type with different conditions. The rules are processed in the order in which they are listed, from top to bottom. Use the arrow icons to change the order.
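As a quick illustration of the three trim actions, the following sketch (the function name `apply_trim` is made up for this example) shows the effect of each action on a single character value:

```python
def apply_trim(action: str, value: str) -> str:
    """Apply one column action rule (LTRIM, RTRIM, or TRIM) to a
    character value, trimming space characters only."""
    if action == "LTRIM":
        return value.lstrip(" ")   # trim spaces to the left
    if action == "RTRIM":
        return value.rstrip(" ")   # trim spaces to the right
    if action == "TRIM":
        return value.strip(" ")    # trim spaces on both sides
    raise ValueError(f"unknown action: {action}")
```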
  7. Under Table View, view or edit the set of selected source tables and columns.
    If you selected Select All, the lists of tables and columns are view only.
    If you applied rules, you can refine the set of selected tables by clicking the check box next to individual tables. Deselect any tables that you do not want to replicate, or select additional items to replicate. Click the Refresh icon to update the selected tables count.
    You can also individually deselect or reselect the columns in a selected source table. To view or change the columns from which data will be replicated for a selected table, click the highlighted number of columns in the Columns column. The column names and data types are displayed to the right. By default, all columns are selected for a selected source table. To deselect or reselect a column, click the check box next to the column name. You cannot deselect a primary key column.
    The following image shows selected tables and the selected columns for the first table:
    Notes:
    • To search for tables, columns, or both, in the drop-down list above the columns list, select Table Name, Columns, or All, then enter a search string in the Find box and click Search. You can include a single asterisk (*) wildcard at the beginning or end of the string.
    • The first time you change a check box setting for a table or column, the rules are no longer in effect. The selections under Table View take precedence. However, if you click the Add Rule (+) icon again, any tables that you deselected or selected individually are reflected as new rules in the Rules list, and the rules once again take precedence. If you want to return to the Table View list, click Apply Rules again.
    • In the Table View section, the length of character columns is shown in bytes. Because the number of bytes per character depends on the character set encoding that the database uses, for some sources the actual column length in characters might differ from the number of bytes shown in the Table View.
    • If you select columns individually, the resulting set of columns is fixed and will not be updated by any schema change, regardless of schema drift settings. For example, if a source column is added or renamed, that column is silently excluded from CDC processing because it is not in the list of selected columns. However, if a selected column is dropped on the source, the schema drift Drop Column option controls how it is handled. The dropped column is not reflected in the list of columns until you apply rules again.
  8. If one or more of the selected source tables are not enabled for change data capture, you can generate a script for enabling CDC and then run or download the script.
    1. In the CDC Script field, select one of the following options:
      • Enable CDC for all columns. Enables CDC for all columns in the selected source tables.
        For source tables without a primary key, including any tables with unique indexes, CDC is enabled for all columns by default, regardless of which option is selected.
      • Enable CDC for primary key columns. Enables CDC only for primary key columns in the selected source tables.
      The script enables supplemental logging for all columns or primary key columns in the selected source tables to log additional information in the redo logs.
    2. To run the script, click Execute.
      If you do not have a database role or privilege that allows you to run the script, click the Download icon to download the script. The script file name has the format cdc_script_taskname_number.txt. Then ask your database administrator to run the script.
      Make sure the script runs before you run the database ingestion task.
    If you change the CDC Script option later and run the script again, the script first drops CDC for the original set of columns and then enables CDC for the current set of columns.
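To illustrate what such a script does, the following Python sketch emits the kind of Oracle supplemental-logging DDL involved. This is illustrative only: the wizard generates the actual script, which may differ, and the function name is invented for this example.

```python
def supplemental_logging_ddl(schema: str, tables, primary_key_only: bool = False):
    """Sketch of the DDL a CDC-enablement script contains: it adds a
    supplemental log group covering either all columns or only the
    primary key columns, so that Oracle writes the additional column
    data to the redo logs. Illustrative only; the wizard generates
    the real script."""
    scope = "PRIMARY KEY" if primary_key_only else "ALL"
    return [
        f'ALTER TABLE "{schema}"."{table}" ADD SUPPLEMENTAL LOG DATA ({scope}) COLUMNS;'
        for table in tables
    ]
```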
  9. To create and download a list of the source tables that match the table selection criteria, perform the following substeps:
    1. If you used rule-based table selection, in the List Tables by Rule Type list, select the type of selection rules that you want to use. Options are:
      • Include Rules Only
      • Exclude Rules Only
      • Include And Exclude Rules
    2. To include columns in the list, regardless of which table selection method you used, select the Include Columns check box.
    3. Click the Download icon.
      A downloaded list that includes columns has the following format:
      status,schema_name,table_name,object_type,column_name,comment
      The following table describes the information that is displayed in the downloaded list:
      status
      Indicates whether Mass Ingestion Databases excludes the source table or column from processing because it has an unsupported type. Valid values are:
      • E. The object is excluded from processing by an Exclude rule.
      • I. The object is included in processing.
      • X. The object is excluded from processing because it is an unsupported type of object. For example, unsupported types of objects include columns with unsupported data types and tables that include only unsupported columns. The comment field provides detail on unsupported types.
      schema_name
      Specifies the name of the source schema.
      table_name
      Specifies the name of the source table.
      object_type
      Specifies the type of the source object. Valid values are:
      • C. Column.
      • T. Table.
      column_name
      Specifies the name of the source column. This information appears only if you selected the Include Columns check box.
      comment
      Specifies the reason why a source object of an unsupported type is excluded from processing even though it matches the selection rules.
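Because the downloaded list is plain comma-separated text in the six-field format above, it can be post-processed with standard tooling. A minimal Python sketch (the function name `unsupported_objects` is made up) that extracts the objects excluded for unsupported types:

```python
import csv
import io

# Field order documented for the downloaded list.
FIELDS = ["status", "schema_name", "table_name", "object_type",
          "column_name", "comment"]

def unsupported_objects(list_text: str):
    """Return (schema, table, column, comment) tuples for rows with
    status X, that is, objects excluded because of an unsupported type."""
    reader = csv.DictReader(io.StringIO(list_text), fieldnames=FIELDS)
    return [(row["schema_name"], row["table_name"],
             row["column_name"], row["comment"])
            for row in reader if row["status"] == "X"]
```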
  10. Under Advanced, set the advanced properties that are available for combined initial and incremental load jobs with Oracle sources.
    Enable Persistent Storage
    Select this check box to enable persistent storage of transaction data in a disk buffer so that the data can be consumed continually, even when the writing of data to the target is slow or delayed.
    Benefits of using persistent storage are faster consumption of the source transaction logs, less reliance on log archives or backups, and the ability to still access the data persisted in disk storage after restarting a database ingestion job.
    Enable Partitioning
    Select this check box to enable partitioning of source objects. When an object is partitioned, the database ingestion job processes the records read from each partition in parallel.
    For Oracle sources, Mass Ingestion Databases determines the range of partitions by using the ROWID as the partition key.
    In combined initial and incremental loads, the partitioning of source objects occurs only in the initial load phase.
    Number of Partitions
    If you enable partitioning of source objects, enter the number of partitions that you want to create. The default number is 5. The minimum value is 2.
    Initial Start Point for Incremental Load
    Set this field if you want to customize the position in the source logs from which the database ingestion job starts reading change records the first time it runs. Options are:
    • Latest Available. The latest available position in the database log or structure.
    • Position. A valid Oracle SCN that is used to determine a position in the change stream from which to start retrieving change records. This value must be equal to or less than the current position value. An invalid value causes the job to fail. The default value is 0, which causes processing to start from the latest available position.
    • Specific Date and Time. A date and time, in the format MM/DD/YYYY hh:mm AM|PM, that Mass Ingestion Databases uses to determine the position in the change stream from which to start retrieving change records. Mass Ingestion Databases retrieves only the changes that started after this date and time. If you enter a date and time earlier than the earliest date and time in the available archived logs, the job will fail.
    The default is Latest Available.
    Note the following points:
    • The Initial Start Point for Incremental Load option pertains only to the initial run of a job. Thereafter, if you resume a stopped or aborted job, the job begins propagating source data from where it last left off.
    • For combined initial and incremental load jobs, initial loading is not performed until the incremental processing of change data reaches the end of the current transaction log. For this reason, Informatica recommends that you select Latest Available as the start point.
      For Oracle combined initial and incremental load jobs, Oracle Flashback queries are used to get committed data that was current at a specific point in the change stream. Ensure that no source table is truncated during the initial load period. If truncation occurs, any DDL change performed during a flashback query causes the query to fail.
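The Specific Date and Time and Position values above can be sanity-checked before you enter them in the wizard. The following Python sketch uses the documented MM/DD/YYYY hh:mm AM|PM pattern; the helper names are hypothetical and the checks are illustrative, not product behavior.

```python
from datetime import datetime

START_POINT_FORMAT = "%m/%d/%Y %I:%M %p"   # MM/DD/YYYY hh:mm AM|PM

def parse_start_point(value: str) -> datetime:
    """Parse a Specific Date and Time start point. Raises ValueError
    for values that do not match the documented format."""
    return datetime.strptime(value, START_POINT_FORMAT)

def is_valid_scn(scn: int, current_scn: int) -> bool:
    """A Position start point must be 0 (start from the latest
    available position) or a positive SCN that is equal to or less
    than the current position."""
    return scn == 0 or 0 < scn <= current_scn
```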
  11. Under Custom Properties, you can specify custom properties at the direction of Informatica Global Customer Support to meet your special requirements. In this scenario, skip this step.
  12. Click Next.
