Google BigQuery Connectors

Rules and guidelines for mapping and mapping tasks

Certain rules and guidelines apply when you configure mappings and mapping tasks.
When you configure a Google BigQuery source or Google BigQuery target, adhere to the following guidelines:
  • When you enable cross-region replication in Google BigQuery, even though you can select regions from multiple continents while creating the dataset replicas in Google BigQuery, you must select regions within the same geographical area to stage data in Google Cloud Storage.
  • When you create a target at runtime, and the field name in the mapping contains special characters or spaces, the mapping fails.
  • When you read from Amazon S3 or Google Cloud Storage and use the update target operation to write to Google BigQuery, ensure that all the columns specified in the update columns field are mapped to the target.
  • When you write large datasets to a Google BigQuery target, increase the Java heap size in the JVM options for type DTM. Set JVMOption3 to -Xms1024m and JVMOption4 to -Xmx4096m in the System Configuration Details section of the Secure Agent and restart the Secure Agent.
  • When you use the Hosted Agent as the runtime environment in a mapping task and use a Hierarchy Builder or Hierarchy Parser transformation in a mapping, you must specify a storage path in Google Cloud Storage in the Schema Definition File Path field under the connection properties. You can then download the sample schema definition file for the Google BigQuery table from the specified storage path in Google Cloud Storage to a local machine.
  • Ensure that there is no blank space before you specify the staging directory for the Local Stage File Directory property for a Google BigQuery source or target. Otherwise, the mapping fails.
  • If a Google BigQuery source contains the DATE, DATETIME, TIME, or TIMESTAMP data types and you create a Google BigQuery target at run time, the Secure Agent writes TIMESTAMP data to the target.
  • When you read JSON data from a MongoDB source table and write data to a column of Record data type in a Google BigQuery target table, you must specify an explicit value for columns that contain _id in the column name. Otherwise, the task fails with the following error:
    [ERROR] The [LOAD] job failed with the error - [JSON parsing error in row starting at position 0:
  • When you use a Google BigQuery V2 connection in simple mode and enable the Use Legacy SQL For SQL Override advanced source property, you can map only a single field of Repeated data type.
  • When you use a Google BigQuery V2 connection in simple mode and configure an advanced data filter condition, ensure that you specify only the column name for the WHERE clause.
    For example, use the following format for the WHERE clause:
    SELECT <col1>, <col2> FROM `<projectID>.<datasetID>.<tableName>` WHERE <col2>='<value>'
  • When you use a Google BigQuery V2 connection in hybrid mode or complex mode, you must not specify a legacy SQL query for the SQL Override Query property. You must clear the Use Legacy SQL For SQL Override advanced source property. Otherwise, the mapping fails.
  • When you use a Google BigQuery V2 connection in simple mode and enable the Use Legacy SQL For Custom Query connection property to define a custom query to read data from a Google BigQuery materialized view, the Secure Agent fails to read the materialized view.
  • When you use a Google BigQuery V2 connection in simple mode and read data from a Google BigQuery materialized view as a single source object, you must clear the Use Legacy SQL For Custom Query option in the Google BigQuery V2 connection. Otherwise, the Secure Agent fails to read the materialized view.
  • When you specify a custom query to read data from Google BigQuery and the source table contains functions as columns, you must specify the alias name for the function.
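    For example, if the custom query uses an aggregate function, you might define an alias in a query similar to the following, where the project ID, dataset ID, table name, and column name are placeholders:
    SELECT MAX(<col1>) AS max_col1 FROM `<projectID>.<datasetID>.<tableName>`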
  • If you specify an SQL override query and configure data filters, the mapping runs successfully but the Secure Agent does not consider the filter conditions.
  • When you select a Google BigQuery partitioned table as a source or target in a mapping, you cannot preview data.
  • When you select a Google BigQuery partitioned table as a source or target in a mapping, you must select the Use Legacy SQL For Custom Query connection property.
  • When you use month and time partitioned tables in a mapping in simple mode, you must clear the Use Legacy SQL For Custom Query connection property.
  • You cannot configure a partitioned table with a partitioned filter in a mapping.
  • When you use Create New at Runtime to write data to Google BigQuery, the mapping task creates the physical target based on the fields from the upstream transformation in the initial run. However, if you later delete the created target table and rerun the mapping task, the Secure Agent fails to create the target table.
  • When you use Create New at Runtime to write data to Google BigQuery and the source column name contains special characters, the task fails to create the target table.
  • When you perform an update, upsert, or delete operation on a Google BigQuery target without using a Merge query and the dataset name, the table name, or both start with a number, the mapping fails. To run the mapping successfully, set the AllowQuotedIdentifier:true custom property in the Provide Optional Properties connection property.
  • When you read null and blank values from a Google BigQuery source in CSV format and perform an update operation on a Google BigQuery target, the null values are written as empty strings without quotes and the blank values are written as empty strings with quotes. To treat null values from the Google BigQuery source as null values in the Google BigQuery target, set the ReadNullAsNullCSVStaging:true custom property in the Provide Optional Properties connection property.
  • When you specify a project name, dataset name, or table name in a Google BigQuery V2 mapping, ensure that you do not use reserved keywords.
  • When you configure a Google BigQuery target and set the Data Format of the staging file to Avro, ensure that you do not map fields of DateTime data type.
  • When you configure a Google BigQuery target and set the Data Format of the staging file to Parquet, ensure that you do not map fields of Time and DateTime data types.
  • When you configure a Google BigQuery target and provide DestinationTable as any existing table in the pre SQL Configuration and post SQL Configuration, set the Write Disposition to Write append or Write truncate. Otherwise, the mapping fails.
  • When you run a mapping to read data of timestamp data type from a Google BigQuery source object, incorrect values are written to the target for certain timestamp values.
  • When you configure a mapping to read or write data of Record data type with nested fields, ensure that the nested fields do not have the same names.
  • When you set the Data Format of the staging file to Parquet and specify a Google BigQuery dataset name in the Source Staging Dataset source advanced property or the Target Staging Dataset target advanced property to override the Staging Dataset Name connection property, ensure that you have the required Google Cloud Storage bucket permission to read data from or write data to Google BigQuery successfully from the staging file in Google Cloud Storage.
  • When you configure the Staging Dataset connection property and create a new target at runtime, ensure that the Google BigQuery dataset where you want to create the target table has the required permission to create the target successfully.
  • When a Google BigQuery source contains columns with the REQUIRED constraint and you use Create New at Runtime to write data, the columns are created with the NULLABLE constraint in the target table.
  • When you read data from a column of BigNumeric data type, ensure that you do not select the Avro staging file format. When you write data to a column of BigNumeric data type, ensure that you do not select the Avro or Parquet staging file format.
  • When you read data from a column of BigNumeric data type, you cannot specify data filters when you use a Google BigQuery connection in simple or complex mode.
  • To pass a column of BigNumeric data type from a Google BigQuery source to a Filter transformation, you must pass the data through an Expression transformation and convert the column to Decimal data type using the TO_DECIMAL() function and map the results to the Filter transformation. Ensure that you specify a precision of 28 and scale of 9.
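    For example, in the Expression transformation you might create an output field of Decimal data type with a precision of 28 and a scale of 9, and configure an expression similar to the following, where BIGNUMERIC_COL is a placeholder field name:
    TO_DECIMAL(BIGNUMERIC_COL, 9)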
  • When you configure the Billing Project ID source advanced property and you configure an SQL override, pre-SQL query, or post-SQL query for the Google BigQuery source, you must use the Project ID value specified in the Google BigQuery V2 connection when you define the query.
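    For example, an SQL override query might look similar to the following, where <connection projectID> is the Project ID value from the Google BigQuery V2 connection and the dataset ID, table name, and column names are placeholders:
    SELECT <col1>, <col2> FROM `<connection projectID>.<datasetID>.<tableName>`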
  • When you configure the Billing Project ID source advanced property, you cannot read data from multiple source objects.
  • When you configure the Billing Project ID target advanced property, you cannot write data to a Google BigQuery target in CDC mode.
  • When you specify the Billing Project ID in the source and target advanced properties and run the mapping, a few connector calls appear in the Connection project. However, billing occurs only in the Billing project.
  • When you provide an incorrect value for the Billing Project ID property in a mapping, the mapping fails with an irrelevant error message:
    [ERROR] Error occured while trying to Initialize Data Source Operation | com.informatica.cci.runtime.internal.utils.impl.CExceptionImpl: !com.infa.adapter.bigqueryv2.runtime.adapter.BigqueryRuntimeException: Truncate Target Failed 400 Bad Request
    POST https://bigquery.googleapis.com/bigquery/v2/projects/automation-project21/jobs
    {"code" : 400, "errors" : [ { "domain" : "global", "message" : "ProjectId and DatasetId must be non-empty", "reason" : "badRequest" } ], "message" : "ProjectId and DatasetId must be non-empty", "status" : "INVALID_ARGUMENT" }
  • When you use the $$$SESSSTARTTIME variable in a custom query, the variable returns the session start time as a string value. Use the following syntax to convert the string value to timestamp or datetime:
    • SELECT PARSE_TIMESTAMP('%m/%d/%Y %H:%M:%E6S', '$$$SESSSTARTTIME' ) as timestamp
      --2022-10-06 18:53:28 UTC
    • SELECT cast(substr(cast('$$$SESSSTARTTIME' as string),0,19) as datetime FORMAT 'MM/DD/YYYY HH24:MI:SS') as datetime;
    • SELECT cast(substr(cast('$$$SESSSTARTTIME' as string),0,19) as timestamp FORMAT 'MM/DD/YYYY HH24:MI:SS') as timestamp;
    When you use the SESSSTARTTIME variable in an Expression transformation, the variable returns the session start time as the DateTime data type. The Filter and Expression transformations use SESSSTARTTIME as the system variable.
  • When you use the SESSSTARTTIME variable in a custom query without casting and the source data type for SESSSTARTTIME is string, you might see a difference in the data format when you compare a mapping that runs with full SQL ELT optimization and a mapping that runs without SQL ELT optimization.
    For example, SESSSTARTTIME returns a DateTime value in the MM/DD/YYYY HH24:MI:SS format when a mapping runs with full SQL ELT optimization. When you run the same mapping without SQL ELT optimization, SESSSTARTTIME appends additional zeroes to the return value, MM/DD/YYYY HH24:MI:SS.000000.
  • When you run a mapping without SQL ELT optimization, you must use only the columns that are mapped in the Update Column field.
  • You must select the is expression variable in-out parameter to read a parameter value as an expression in a Filter transformation when you run a mapping without SQL ELT optimization.
  • The session log doesn't record the staging optimization error messages in the following scenarios:
    • A mapping with a cached lookup transformation and the staging optimization at source enabled runs without staging optimization.
    • A mapping runs without enabling staging optimization at target.
  • When you map a string data type in the source to a time data type in the target, and the data is in the format HH24:MI:SS.US, the mapping fails with the following error:
    ERROR: Invalid Data Type Conversion
  • When an IN function includes a null value in the list of values in an Expression transformation, the mapping fails with the following error:
    TE_7002 Transformation stopped due to a fatal error in the mapping. The expression [In(col1_string,null,'1','2','3','6',0)] contains the following errors [Function validation for [In] failed: [The arguments must be of the same datatype.]. <<PM Parse Error>> [In]: : invalid function reference ... >>>>In(col1_string,null,'1','2','3','6',0)<<<<].
    To run the mapping successfully, use the TO_CHAR(null) function for the string data type and the TO_DATE(null) function for the date/time data type.
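    For example, for the string column in the failing expression above, you might rewrite the expression similar to the following:
    IN(col1_string, TO_CHAR(null), '1', '2', '3', '6', 0)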
    The mapping runs successfully if SQL ELT optimization is enabled.
  • When you use the IN function in an Expression transformation that includes a column with a null value in the list of values and there is no match, the function returns False instead of NULL.
    If the mapping is enabled with SQL ELT optimization, the function returns NULL.
  • If the source field name contains Unicode characters and you use the Create Target option to write to a Google BigQuery target, the mapping fails.
  • When you run a mapping to write data of the BigNumeric data type to a new target created at runtime, the scale is not honored in the data. The mapping processes the values up to 28 digits regardless of the scale.
  • When you map a BigNumeric data type to the Decimal data type with a precision greater than 28 digits, data truncation occurs at the target.
  • When you write data of the BigNumeric data type to a target created at runtime, BigNumeric defaults to Numeric data type in the target. You can use the edit metadata option to change the native data type from Numeric to BigNumeric.
  • When you configure an update, delete, or data driven operation to write data and if the metadata in the target does not match the source, the mapping behavior is not deterministic.
  • When you configure the REG_REPLACE(<column_name>, '.* ', <value_to_replace>) function in a mapping, the function replaces zero or more characters in the <column_name> argument with the value provided in the <value_to_replace> argument. In this scenario, the replaced value is applied twice in the column instead of once when the criteria are met.
    However, a mapping enabled for SQL ELT optimization with the REG_REPLACE() function correctly replaces the value with a single entry.
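    For example, an expression similar to the following, where FULL_NAME is a placeholder field name, replaces the characters that the '.* ' pattern matches in the FULL_NAME field with the replacement value:
    REG_REPLACE(FULL_NAME, '.* ', 'N/A')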
  • When the arguments are null in the REG_REPLACE() function, the mapping fails with the following error:
    Transformation stopped due to a fatal error in the mapping. The expression [Reg_Replace(null,null,null)] contains the following errors [Function validation for [Reg_Replace] failed: [The subject and pattern arguments must be of the char datatype.].
    However, a mapping enabled for SQL ELT optimization with the REG_REPLACE() function runs successfully and returns a null value.
  • When you configure a mapping, you cannot use view or materialized view as a target object.
  • When you run an existing mapping to write a view as the target object, override the target object with the Target Table Name advanced property, and set the optional property DisableMappingDeployment:true in the Google BigQuery V2 connection, the mapping fails with the following error:
    Operation failed: Internal Error Occurred. Contact Support : [Cannot create write operation. The node supports read operation only.].
    To run the mapping successfully, remove the optional property DisableMappingDeployment:true from the Google BigQuery V2 connection.
  • You can perform the following operations with views in a mapping:
    • Read data from Google BigQuery views in a Source transformation with staging optimization enabled.
    • Look up data from Google BigQuery views in a Lookup transformation.
  • You cannot enable staging optimization in a mapping if the Enable BigQuery Storage API option is selected in the Google BigQuery V2 connection. A mapping with this configuration fails with the following error:
    READER_1_1_1> _38644 [2024-02-21 11:56:54.700] Plug-in #601601:Log Message from CCI : [DTM staging is not supported when [Enable BigQuery Storage API] is checked].
  • When you configure Simple as the connection mode in the Google BigQuery V2 connection and Staging as the read mode, you cannot use views as a source or lookup object if the input data contains the Record data type.
    If you run a mapping with this configuration, the mapping fails with the following error:
    [ERROR] You cannot use [Standard] SQL view as it is not compatible with the Use Legacy SQL for Custom Query parameter selection in the Google BigQuery connection. Ensure that the view type matches with the selected SQL language in the connection.
    To run the mapping successfully, you can configure Hybrid as the connection mode in the Google BigQuery V2 connection or Direct as the read mode.
  • When you configure a Lookup transformation to return either the first or last row and the incoming fields contain columns with the Record data type, the mapping fails with the following error:
    The [QUERY] job failed with the following error: 'ORDER BY' does not support expressions of type 'STRUCT<...>'
    This issue occurs when the configured connection mode is hybrid.
  • You cannot configure a partitioned table with a filter as a single source object or in target operations. If you configure a mapping with these scenarios, the mapping fails with the following error:
    [ERROR] The [QUERY] job failed with the following error: [Cannot query over table <table name> without a filter over column(s) <column name>, <column name>, <column name> that can be used for partition elimination]
  • When you read columns with the Record data type, where the connection uses simple mode and the mapping uses staging as the read mode, you need to set the Use Legacy SQL for Custom Query property in the Google BigQuery V2 connection to run the mapping successfully. Otherwise, the mapping fails with the following error:
    [ERROR] The [QUERY] job failed with the following error: [Invalid schema update. Cannot add fields (field: DateTime)]
  • To read source data that contains Repeated data type columns with the simple connection mode, enable the Use Legacy SQL for Custom Query option in the connection properties. To use the hybrid connection mode, clear the option so that the mapping uses standard SQL instead of legacy SQL to read Repeated data type columns.
