Configuring a Salesforce source

On the Source page of the application ingestion task wizard, you can specify the objects that you want to ingest and configure the advanced properties for your Salesforce source. You can also specify custom properties to address unique environments and special use cases.
  1. For initial load tasks and combined initial and incremental load tasks, select the type of Salesforce API that you want to use to retrieve the source data.
    Options are:
    • Standard (REST) API: Replicates source fields of Base64 data type. Informatica recommends that you use the Bulk API 2.0 unless you want to ingest fields of Base64 data type or objects that are not supported by Bulk API 2.0 during initial loading of data. All incremental load activities use only the standard REST API.
    • Bulk API 2.0: Excludes replication of source fields of Base64 data type. Bulk API 2.0 is the default API for initial load tasks and the initial load of the combined initial and incremental load tasks.
    By default, incremental load tasks can capture and replicate change data from source fields of Base64 data type.
  2. In the Object Selection section, select Select All only if you want to select all source objects and fields for data replication. You cannot edit the selection in subsequent fields.
    The Objects Selected field shows the count of all selected objects. If you have many source objects, the interface might take a long time to fetch them.
    Alternatively, you can use rules to define a subset of source objects to replicate.
  3. To use rules to select the source objects, make sure that the Select All check box is cleared and then add rules.
    When rule-based selection is used, you can refine the set of selected objects by object under Object View and also set an option for trimming spaces in character data.
    The default "Include *" rule selects all source objects accessed with the selected connection. To see how many objects are selected by this rule, click the Refresh icon to display the object count in Total Objects Selected and click Apply Rules to see the object count in Object View.
    To add a rule:
    1. Click the Add Rule (+) icon above the first table under Rules. A row is added to define a new rule.
    2. In the Object Rule field, select Include or Exclude to create an inclusion or exclusion rule, respectively.
    3. In the Condition column, enter an object name or an object-name mask that includes one or more wildcards to identify the source objects to include in or exclude from object selection. Use the following guidelines:
      • A mask can contain one or both of the following wildcards: an asterisk (*) wildcard to represent one or more characters and a question mark (?) wildcard to represent a single character. A wildcard can occur multiple times in a mask value and can occur anywhere in the value.
      • The task wizard is case sensitive. Enter the object names or masks in the case with which the objects were defined.
      • Do not include delimiters such as quotation marks or brackets, even if the source uses them.
      • If an object name includes special characters such as a backslash (\), asterisk (*), dollar sign ($), caret (^), or question mark (?), escape each special character with a backslash (\) when you enter the rule.
    4. Define additional rules as needed.
      The rules are processed in the order in which they're listed, from top to bottom. Use the arrow icons to change the order.
    5. When finished, click Apply Rules.
      Tip: Click the Refresh icon to the right of the Updated timestamp to refresh the Objects Affected and Total Objects Selected counts.
      After you apply rules, if you add, delete, or change rules, you must click Apply Rules again. Click the Refresh icon to update the object counts. If you delete all rules without clicking Apply Rules, a validation error occurs at deployment, even if the Object View list still lists objects. If you switch to Select All, the rules no longer appear.
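The mask guidelines and top-to-bottom rule processing described above can be illustrated with a minimal Python sketch. It is not the wizard's actual matching logic, which this topic does not document. The function names `mask_to_regex` and `select_objects` are hypothetical, and the sketch assumes that an Include rule adds matching objects to the selection and that a later Exclude rule removes them.

```python
import re

def mask_to_regex(mask: str) -> re.Pattern:
    """Translate an object-name mask into a case-sensitive regular expression."""
    parts = []
    i = 0
    while i < len(mask):
        ch = mask[i]
        if ch == "\\" and i + 1 < len(mask):
            # A backslash escapes the next character, which is matched literally.
            parts.append(re.escape(mask[i + 1]))
            i += 2
            continue
        if ch == "*":
            parts.append(".+")  # one or more characters, per the guideline above
        elif ch == "?":
            parts.append(".")   # exactly one character
        else:
            parts.append(re.escape(ch))
        i += 1
    return re.compile("".join(parts))

def select_objects(rules, object_names):
    """Apply (action, mask) rules in listed order (assumed semantics)."""
    selected = set()
    for action, mask in rules:
        pattern = mask_to_regex(mask)
        matches = {name for name in object_names if pattern.fullmatch(name)}
        if action == "Include":
            selected |= matches
        elif action == "Exclude":
            selected -= matches
        else:
            raise ValueError(f"Unknown rule action: {action}")
    return sorted(selected)

# Illustrative object names only; "account" shows that matching is case sensitive.
objects = ["Account", "AccountHistory", "Contact", "Case", "account"]
rules = [("Include", "*"), ("Exclude", "*History")]
print(select_objects(rules, objects))
# ['Account', 'Case', 'Contact', 'account']
```

Because the rules run in order, swapping the two rules in this sketch would re-include AccountHistory, which is why the order of rules in the list matters.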
  4. To perform trim actions on the fields of the source objects that were selected based on rules, create field action rules.
    Perform the following steps to create a field action rule:
    1. Select Field Action as the rule type.
    2. From the adjacent list, select one of the following action types:
      • LTRIM. Trims spaces to the left of character field values.
      • RTRIM. Trims spaces to the right of character field values.
      • TRIM. Trims spaces to the left of and to the right of character field values.
    3. In the condition field, enter a field name or a field-name mask that includes one or more asterisk (*) or question mark (?) wildcards. The value that you enter is matched against fields of the selected source objects to identify the fields to which the action applies.
    4. Click Add Rule.
    • You cannot use field action rules to trim the spaces in rich text area fields of Salesforce because Salesforce uses control characters instead of empty values to represent the spaces.
    • You can define multiple rules for different action types or for the same action type with different conditions. The field action rules are processed in the order in which they are listed in the Rules list. The rule at the top of the list is processed first. You can use the arrow icons to change the order in which the rules are listed.
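As a rough illustration of the three trim actions, the following Python sketch applies field action rules to a single record in listed order. The function `apply_field_actions` and the sample record are hypothetical. Field-name matching uses Python's `fnmatchcase`, which only approximates the wizard's wildcards: its asterisk also matches zero characters, and it treats square brackets specially.

```python
from fnmatch import fnmatchcase

# Each action removes space characters only, mirroring the descriptions above.
TRIM_ACTIONS = {
    "LTRIM": lambda value: value.lstrip(" "),
    "RTRIM": lambda value: value.rstrip(" "),
    "TRIM": lambda value: value.strip(" "),
}

def apply_field_actions(record, rules):
    """Apply (action, field_mask) rules top to bottom to character fields."""
    result = dict(record)
    for action, mask in rules:
        trim = TRIM_ACTIONS[action]
        for field, value in result.items():
            # Only character (string) values are trimmed; other types are left alone.
            if isinstance(value, str) and fnmatchcase(field, mask):
                result[field] = trim(value)
    return result

# Hypothetical record and rules for illustration.
record = {"Name": "  Acme  ", "BillingCity": " Oslo ", "Employees": 12}
rules = [("LTRIM", "Name"), ("RTRIM", "Billing*")]
print(apply_field_actions(record, rules))
# {'Name': 'Acme  ', 'BillingCity': ' Oslo', 'Employees': 12}
```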
  5. Under Object View, view the selected objects, including the number of fields in each object and the field names and data types.
    • If you selected Select All, the list of objects is view only.
    • If you applied rules, you can refine the set of selected objects by clicking the check box next to individual objects. Clear any objects that you do not want to replicate, or select additional objects to replicate. Click the Refresh icon to update the selected objects count.
    • You can also individually clear or reselect the fields in a selected source object. To view or change the fields from which data will be replicated for a selected object, click the highlighted number of fields in the Fields column. The field names and data types are displayed to the right. By default, all the fields are selected for a selected source object. To clear a column or reselect it, click the check box next to the field name. You can't clear a primary key column.
    • To search for objects and fields, in the drop-down list above Fields, select Object Name, Fields, or All and then enter a search string in the Find box and click Search. You can include a single asterisk (*) wildcard at the beginning or end of the string.
    • The first time you change a check box setting for an object, the rules are no longer in effect. The selections under Object View take precedence. However, if you click the Add Rule (+) icon again, the objects that you deselected or selected individually are reflected as new rules in the Rules list and the rules once again take precedence. If you want to return to the Object View list, click Apply Rules again.
    • If you rename a column while it is in an Up and Running state, the new name appears in the column selection, but it's not updated on the View tab.
    • If you select fields individually, the resulting set of fields is fixed and will not be updated by any schema change, regardless of the schema drift settings.
      For example, if a source field is added or renamed, that field is silently excluded from CDC processing because it's not in the list of selected fields. However, if a selected field is dropped on the source, the schema drift Drop Field option controls how it's handled. The dropped field operation is not reflected in the list of fields until you apply the rules again.
    • For combined initial and incremental load jobs, if you add to or modify a field selection, and redeploy the task, then a resync operation is triggered automatically. The resync is required to ensure that the target table has the same values as the source. For incremental load jobs, which do not have a resync option, modifying a field selection results in an error. If you modify the field selection of a deployed job and then redeploy it, the source and target will no longer match.
      If you remove a selected field, a resync operation is not triggered, and no error is reported. Removal of a selected field is treated the same as a drop field event, triggering your drop field schema drift settings for the task.
  6. To download a list of source objects that match the selection rules, perform the following steps:
    1. From the List Objects by Rule Type list, select the type of selection rule for which you want to download the list of selected source objects.
    2. If you want to include the fields in the list, select Include Fields.
    3. Click the Download icon.
      The list of source objects that match the selection rules is downloaded to your local drive.
      The information in the downloaded file is in the following format:
      status,object_name,object_type,field_name,comment
      The downloaded file contains the following fields:
      • status: Indicates whether Mass Ingestion Applications includes or excludes the source object from processing. The possible values are:
        • E. The object is excluded from processing by an Exclude rule.
        • I. The object is included for processing.
        • X. The object is excluded from processing even though it matches the selection rules. The comment field in the file provides details on why the object is excluded.
      • object_name: Name of the source object.
      • object_type: Type of the source object. The possible values are:
        • O: Indicates an object.
        • F: Indicates a field.
      • field_name: Name of the source field. This information appears only if you selected the Include Fields check box before downloading the list.
      • comment: Reason why a source object is excluded from processing even though it matches the selection rules.
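If you want to review a downloaded list programmatically, a short Python sketch such as the following could summarize it. The sketch assumes that the file is plain comma-separated text without a header row, which this topic does not confirm. The sample rows, the comment text, and the function name `summarize_object_list` are hypothetical.

```python
import csv
import io
from collections import Counter

# Field order taken from the format shown above.
COLUMNS = ["status", "object_name", "object_type", "field_name", "comment"]

def summarize_object_list(text):
    """Count object rows per status and collect the comments for X rows."""
    reader = csv.DictReader(io.StringIO(text), fieldnames=COLUMNS)
    # Keep object rows (type O) only; field rows (type F) are ignored here.
    object_rows = [row for row in reader if row["object_type"] == "O"]
    counts = Counter(row["status"] for row in object_rows)
    skipped = {
        row["object_name"]: row["comment"]
        for row in object_rows
        if row["status"] == "X"
    }
    return counts, skipped

# Hypothetical file content for illustration.
SAMPLE = (
    "I,Account,O,,\n"
    "I,Account,F,Name,\n"
    "E,AccountHistory,O,,\n"
    "X,Attachment,O,,hypothetical reason text\n"
)

counts, skipped = summarize_object_list(SAMPLE)
print(dict(counts))  # {'I': 1, 'E': 1, 'X': 1}
print(skipped)       # {'Attachment': 'hypothetical reason text'}
```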
  7. Expand the Advanced section.
  8. For incremental load tasks, in the Initial Start Point for Incremental Load field, specify the point in the source data stream from which the ingestion job associated with the application ingestion task starts extracting change records.
    You must specify the date and time in Greenwich Mean Time (GMT).
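Because the start point must be expressed in GMT, you might need to convert a local date and time before you enter it. The following Python sketch shows one way to do the conversion from a fixed UTC offset. The function name `to_gmt` and the `YYYY-MM-DD HH:MM:SS` layout are assumptions for illustration, not the format that the wizard requires.

```python
from datetime import datetime, timedelta, timezone

def to_gmt(local_time: str, utc_offset_hours: float) -> str:
    """Convert a local timestamp with a known UTC offset to GMT."""
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    local = datetime.strptime(local_time, "%Y-%m-%d %H:%M:%S").replace(tzinfo=local_zone)
    return local.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")

# 09:30 at UTC-5 corresponds to 14:30 GMT on the same day.
print(to_gmt("2024-03-01 09:30:00", -5))  # 2024-03-01 14:30:00
```

A fixed offset does not account for daylight saving time, so use the offset that is in effect on the date you convert.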
  9. For incremental load tasks and combined initial and incremental load tasks, in the CDC Interval field, specify the time interval in which the application ingestion job runs to retrieve the change records for incremental load. The default interval is 5 minutes.
  10. In the Fetch Size field, enter the number of records that the application ingestion job associated with the task reads at a time from the source.
    The default value for initial load operations is 50000 and the default value for incremental load operations is 2000.
    For combined initial and incremental load tasks, you must specify the fetch size separately for initial load operations and incremental load operations.
  11. For initial load and combined initial and incremental load tasks, select Include Archived and Deleted Rows to replicate the archived and soft-deleted rows from the source during the initial loading of data.
  12. For initial load and combined initial and incremental load tasks, select Enable Partitioning to partition the source objects for initial loading. In the Chunk Size field, enter the number of records to be processed in a single partition. Based on the chunk size, bulk jobs are created in Salesforce. The default value is 50000 and the minimum value is 100.
    When you partition an object, the application ingestion job processes the records for each partition in parallel. Mass Ingestion Applications determines the range of partitions by equal distribution of primary key values of an object.
    You can partition the objects only if you select Bulk API 2.0 as the Salesforce API.
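To get a feel for how the chunk size affects the number of partitions, the following Python sketch estimates the partition count for an object. It assumes that each partition holds at most one chunk of records. The actual ranges are derived from primary key values, so real partitions can be uneven. The function name `estimate_partition_count` is hypothetical.

```python
import math

DEFAULT_CHUNK_SIZE = 50_000  # default value stated above
MIN_CHUNK_SIZE = 100         # minimum value stated above

def estimate_partition_count(record_count: int, chunk_size: int = DEFAULT_CHUNK_SIZE) -> int:
    """Estimate how many partitions an object of record_count rows would need."""
    if chunk_size < MIN_CHUNK_SIZE:
        raise ValueError(f"Chunk size must be at least {MIN_CHUNK_SIZE}")
    return math.ceil(record_count / chunk_size)

print(estimate_partition_count(1_200_000))        # 24
print(estimate_partition_count(120_001, 50_000))  # 3
```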
  13. Select Include Base64 Fields to replicate the source fields of Base64 data type.
    • You can replicate the Base64 fields only if you select Standard (REST) API as the Salesforce API.
    • Replication of Base64 data might slow down the initial load operation of the application ingestion job.
  14. If you selected the Include Base64 Fields check box, in the Maximum Base64 Body Size field, specify the body size for Base64 encoded data. The default body size for Base64 encoded data is 7 MB.
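The dependencies between the API type, partitioning, and Base64 options can be summarized in a small Python sketch. The class `SalesforceSourceOptions` and the function `find_conflicts` are hypothetical and exist only to restate the constraints from the preceding steps. They are not part of any Informatica API.

```python
from dataclasses import dataclass

BULK_API = "Bulk API 2.0"
REST_API = "Standard (REST) API"

@dataclass
class SalesforceSourceOptions:
    """Hypothetical container for the initial load options described above."""
    api: str = BULK_API                 # Bulk API 2.0 is the default for initial loads
    enable_partitioning: bool = False
    include_base64_fields: bool = False
    max_base64_body_mb: int = 7         # default body size stated above

def find_conflicts(options: SalesforceSourceOptions) -> list:
    """Return a message for each option combination that the steps above rule out."""
    conflicts = []
    if options.enable_partitioning and options.api != BULK_API:
        conflicts.append("Enable Partitioning requires Bulk API 2.0.")
    if options.include_base64_fields and options.api != REST_API:
        conflicts.append("Include Base64 Fields requires Standard (REST) API.")
    return conflicts

print(find_conflicts(SalesforceSourceOptions()))
# []
print(find_conflicts(SalesforceSourceOptions(api=REST_API, enable_partitioning=True)))
# ['Enable Partitioning requires Bulk API 2.0.']
```

One consequence of these constraints is that partitioning and Base64 replication cannot both be enabled for the same initial load, because each requires a different API.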
  15. In the Custom Properties section, you can specify custom properties that Informatica provides for special cases. To add a property, add the property name and value, and then click Add Property.
    The custom properties are usually configured to address unique environments and special use cases.
    Specify the custom properties only at the direction of Informatica Global Customer Support.
  16. Click Next.
