Hi, I'm Ask INFA!
What would you like to know?
ASK INFAPreview
Please to access Ask INFA.

Ingesting Db2 for i Change Data into Snowflake with Cloud Mass Ingestion Databases

Ingesting Db2 for i Change Data into Snowflake with Cloud Mass Ingestion Databases

Step 2: Configuring the Db2 for i source

Step 2: Configuring the Db2 for i source

Configure the Db2 for i source on the
Source
page of the
database ingestion
task wizard.
  1. In the
    Connection
    list, select the connection for the Db2 for i source that you configured and tested in
    Administrator
    for a runtime environment that your organization uses. The connection type appears in parentheses after the connection name.
    The list includes only the connection types that are valid for the load type selected on the
    Definition
    page. No connections are listed if you did not select a load type.
    After you deploy the ingestion task, you cannot change the connection without first undeploying the associated ingestion job. After you change the connection, you must deploy the task again.
  2. In the
    Schema
    list, select the source schema that includes the source tables.
    The list includes only the schemas that are available in the database accessed with the specified source connection.
  3. In the
    Journal Name
    field, select the name of the journal that records the changes made to the source tables.
  4. Under
    Table Selection
    , select
    Select All
    only if you want to select all tables and columns in the schema for data replication. You cannot edit the selection in subsequent fields.
    The
    Tables Selected
    field shows the count of all selected tables. The interface fetches information about all of the tables.
    Fetching information for all of the tables might take a long time.
    The following image shows the
    Source
    page when
    Select All
    is selected:
    Alternatively, you can use rules to define a subset of source tables to replicate.
  5. To use rules to select the source tables, make sure that the
    Select All
    check box is cleared and then add rules.
    When rule-based selection is used, you can refine the set of selected tables by table under
    Table View
    and also set an option for trimming spaces in character data.
    The default "Include *" rule selects all tables in the specified schema. To see how many tables are selected by this rule, click the Refresh icon to display the table count in
    Total Tables Selected
    and click
    Apply Rules
    to see the table count in
    Table View
    .
    To add a rule:
    1. Under
      Rules
      , click the Add Rule (+) icon above the first table. A row is added to the table.
    2. In the
      Table Rule
      column, select
      Include
      or
      Exclude
      to create an inclusion or exclusion rule, respectively.
    3. In the
      Condition
      column, enter a table name or a table-name mask that includes one or more wildcards to identify the source tables to include in or exclude from table selection. Use the following guidelines:
      • A mask can contain one or both of the following wildcards: an asterisk (*) wildcard to represent one or more characters and a question mark (?) wildcard to represent a single character. A wildcard can occur multiple times in a mask value and can occur anywhere in the value.
      • The task wizard is case sensitive. Enter the table names or masks in the case with which the tables were defined.
      • Do not include delimiters such as quotation marks or brackets, even if the source database uses them.
      • If a table name includes special characters such as a backslash (\), asterisk(*), dollar sign ($), caret (^), or question mark (?), escape each special character with a backslash (\) when you enter the rule.
    4. Define additional rules as needed.
      The rules will be processed in the order in which they're listed, from top to bottom. Use the arrow icons to change the order.
    5. When finished, click
      Apply Rules
      .
      The
      Total Tables Selected
      and
      Tables Affected
      counts are updated.
      The following image shows multiple rules defined on the
      Source
      page:
      After you apply rules, if you add, delete, or change rules, you must click
      Apply Rules
      again. Click the Refresh icon to update the table counts. If you delete all rules without clicking
      Apply Rules
      , a validation error will occur at deployment, even if the
      Table View
      list still lists tables. If you switch to
      Select All
      , the rules are no longer in effect and disappear.
  6. To perform trim actions on character columns in the source tables selected based on rules, create column action rules.
    To create a column action rule:
    1. Click the Add Rule (+) icon above the second Action table under
      Rules
      .
    2. In the
      Action
      column, select one of the following options:
      • LTRIM
        . Trims spaces to the left of character column values.
      • RTRIM
        . Trims spaces to the right of character column values.
      • TRIM
        . Trims spaces to the left of and to the right of character column values.
    3. In the
      Condition
      column, enter a column name or a column name mask that includes one or more asterisk (*) or question mark (?) wildcards. The value is matched against columns in the selected source tables to identify the columns to which the action applies.
    You can define multiple rules for different action types or for the same action type with different conditions. The rules are processed in the order in which they're listed, from top to bottom. Use the arrow icons to change the order.
  7. Under
    Table View
    , view or edit the set of selected source tables and columns.
    If you selected
    Select All
    , the lists of tables and columns are view only.
    If you applied rules, you can refine the set of selected tables by clicking the check box next to individual tables. Deselect any tables that you do not want to replicate, or select additional items to replicate. Click the Refresh icon to update the selected tables count.
    The following image shows selected tables:
    Notes:
    • To search for tables or columns or both, in the drop-down list above the columns list, select
      Table Name
      ,
      Columns
      , or
      All
      and then enter a search string in the
      Find
      box and click
      Search
      . You can include a single asterisk (*) wildcard at the beginning or end of the string.
    • The first time you change a check box setting for a table or column, the rules are no longer in effect. The selections under
      Table View
      take precedence. However, if you click the Add Rule (+) icon again, any tables that you deselected or selected individually are reflected as new rules in the Rules list and the rules once again take precedence. If you want to return to the
      Table View
      list, click
      Apply Rules
      again.
    • In the
      Table View
      section, the length of character columns is shown in bytes. Since the number of bytes per character differs depending on the character set encoding used by the database, for some sources, the actual column length in number of characters might be different than the number of bytes shown in the
      Table View
      .
  8. If one or more of the selected source tables are not enabled for change data capture, you can generate a script for enabling CDC and then run or download the script.
    1. In the
      CDC Script
      field, select
      Enable CDC for all columns
      to enable CDC for all columns in the selected source tables. This option is the only available option for Db2 for i sources.
      The script enables journaling on the Db2 for i source tables.
    2. To run the script, click
      Execute
      .
      If you do not have a database role or privilege that allows you to run the script, click the Download icon to download the script. The script file name has the following format:
      cdc_script_
      taskname
      _
      number
      .txt
      . Then ask your database administrator to run the script.
      Make sure the script runs before you run the database ingestion task.
    If you change to the
    CDC Script
    option later and run the script again, the script first drops CDC for the original set of columns and then enables CDC for the current set of columns.
  9. To create and download a list of the source tables that match the table selection criteria, perform the following substeps:
    1. If you used rule-based table selection, in the
      List Tables by Rule Type
      list, select the type of selection rules that you want to use. Options are:
      • Include Rules Only
      • Exclude Rules Only
      • Include And Exclude Rules
    2. To include columns in the list, regardless of which table selection method you used, select the
      Include Columns
      check box.
    3. Click the Download icon.
      A downloaded list that includes columns has the following format:
      status
      ,
      schema_name
      ,
      table_name
      ,
      object_type
      ,
      column_name
      ,
      comment
      The following table describes the information that is displayed in the downloaded list:
      Field
      Description
      status
      Indicates whether Mass Ingestion Databases excludes the source table or column from processing because it has an unsupported type. Valid values are:
      • E
        . The object is excluded from processing by an Exclude rule.
      • I
        . The object is included in processing.
      • X
        . The object is excluded from processing because it is an unsupported type of object. For example, unsupported types of objects include columns with unsupported data types and tables that include only unsupported columns. The comment field provides detail on unsupported types.
      schema_name
      Specifies the name of the source schema.
      table_name
      Specifies the name of the source table.
      object_type
      Specifies the type of the source object. Valid values are:
      • C
        . Column.
      • T
        . Table.
      column_name
      Specifies the name of the source column. This information appears only if you selected the
      Columns
      check box.
      comment
      Specifies the reason why a source object of an unsupported type is excluded from processing even though it matches the selection rules.
  10. Under
    Advanced
    , set the advanced properties that are available for incremental load jobs with Db2 for i sources:
    Property
    Description
    Enable Persistent Storage
    Select this check box to enable persistent storage of transaction data in a disk buffer so that the data can be consumed continually, even when the writing of data to the target is slow or delayed.
    Benefits of using persistent storage are faster consumption of the source transaction logs, less reliance on log archives or backups, and the ability to still access the data persisted in disk storage after restarting a database ingestion job.
    Initial Start Point for Incremental Load
    Set this field if you want to customize the position in the source logs from which the
    database ingestion
    job starts reading change records the first time it runs.
    Options are:
    • Earliest Available
      . The start of the current journal with changes made to the Db2 for i source table.
    • Latest Available
      . The latest available position in the database log.
    • Position
      . A valid timestamp that is used to determine a position in the change stream from which to start retrieving change records. The timestamp must be equal to or less than the current timestamp. An invalid value will cause the job to fail. For Db2 for i sources, do not use the default value of 0.
    • Specific Date and Time
      . A date and time, in the format MM/DD/YYYY
      hh:mm AM|PM
      , that
      Mass Ingestion Databases
      uses to determine the position in the change stream from which to start retrieving change records.
      Mass Ingestion Databases
      retrieves only the changes that were started after this date and time. If you enter a date and time earlier than the earliest date and time in the available archived logs, the job will fail.
    The default is
    Latest Available
    .
    The
    Initial Start Point for Incremental Load
    option pertains only to the initial run of a job. Thereafter, if you resume a stopped or aborted job, the job begins propagating source data from where it last left off.
  11. Under
    Custom Properties
    , you can specify custom properties at the direction of Informatica Global Customer Support to meet your special requirements. In this scenario, skip this step.
  12. Click
    Next
    .

0 COMMENTS

We’d like to hear from you!