Google BigQuery Connectors

Rules and guidelines for mapping and mapping tasks

Certain rules and guidelines apply when you configure mappings and mapping tasks.
When you configure a Google BigQuery source or Google BigQuery target, adhere to the following guidelines:
  • When you enable cross-region replication in Google BigQuery, even though you can select regions from multiple continents while creating the dataset replicas in Google BigQuery, you must select regions within the same geographical area to stage data in Google Cloud Storage.
  • When you create a target at runtime, and the field name in the mapping contains special characters or spaces, the mapping fails.
  • When you read from Amazon S3 or Google Cloud Storage and use the update target operation to write to Google BigQuery, ensure that all the columns specified in the update columns field are mapped to the target.
  • When you write large datasets to a Google BigQuery target, increase the Java heap size in the JVM options for type DTM. Set JVMOption3 to -Xms1024m and JVMOption4 to -Xmx4096m in the System Configuration Details section of the Secure Agent and restart the Secure Agent.
  • When you use the Hosted Agent as the runtime environment in a mapping task and use a Hierarchy Builder or Hierarchy Parser transformation in a mapping, you must specify a storage path in Google Cloud Storage in the Schema Definition File Path field under the connection properties. You can then download the sample schema definition file for the Google BigQuery table from the specified storage path in Google Cloud Storage to a local machine.
  • Ensure that there is no blank space before you specify the staging directory for the Local Stage File Directory property for a Google BigQuery source or target. Otherwise, the mapping fails.
  • If a Google BigQuery source contains the DATE, DATETIME, TIME, or TIMESTAMP data types and you create a Google BigQuery target at run time, the Secure Agent writes TIMESTAMP data to the target.
  • When you read JSON data from a MongoDB source table and write data to a column of Record data type in a Google BigQuery target table, you must specify an explicit value for columns that contain _id in the column name. Otherwise, the task fails with the following error:
    [ERROR] The [LOAD] job failed with the error - [JSON parsing error in row starting at position 0:
  • When you use a Google BigQuery V2 connection in simple mode and enable the Use Legacy SQL For SQL Override advanced source property, you can map only a single field of Repeated data type.
  • When you use a Google BigQuery V2 connection in simple mode and configure an advanced data filter condition, ensure that you specify only the column name for the WHERE clause.
    For example, use the following format for the WHERE clause:
    SELECT <col1>, <col2> FROM `<projectID>.<datasetID>.<tableName>` WHERE <col2>='<value>'
  • When you use a Google BigQuery V2 connection in hybrid mode or complex mode, you must not specify a legacy SQL query for the SQL Override Query property. You must clear the Use Legacy SQL For SQL Override advanced source property. Otherwise, the mapping fails.
  • When you use a Google BigQuery V2 connection in simple mode and enable the Use Legacy SQL For Custom Query connection property to define a custom query to read data from a Google BigQuery materialized view, the Secure Agent fails to read the materialized view.
  • When you use a Google BigQuery V2 connection in simple mode and read data from a Google BigQuery materialized view as a single source object, you must clear the Use Legacy SQL For Custom Query option in the Google BigQuery V2 connection. Otherwise, the Secure Agent fails to read the materialized view.
  • When you specify a custom query to read data from Google BigQuery and the source table contains functions as columns, you must specify the alias name for the function.
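    For example, if the custom query uses an aggregate function, you might define an alias in a query similar to the following, where the project ID, dataset ID, table name, and column name are placeholders:
    SELECT MAX(<col1>) AS max_col1 FROM `<projectID>.<datasetID>.<tableName>`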
  • If you specify an SQL override query and configure data filters, the mapping runs successfully but the Secure Agent does not consider the filter conditions.
  • When you select a Google BigQuery partitioned table as a source or target in a mapping, you cannot preview data.
  • When you select a Google BigQuery partitioned table as a source or target in a mapping, you must select the Use Legacy SQL For Custom Query connection property.
  • When you use month and time partitioned tables in a mapping in simple mode, you must clear the Use Legacy SQL For Custom Query connection property.
  • You cannot configure a partitioned table with a partitioned filter in a mapping.
  • When you use Create New at Runtime to write data to Google BigQuery, the mapping task creates the physical target based on the fields from the upstream transformation in the initial run. However, if you later delete the created target table and rerun the mapping task, the Secure Agent fails to create the target table.
  • When you use Create New at Runtime to write data to Google BigQuery and the source column name contains special characters, the task fails to create the target table.
  • When you perform an update, upsert, or delete operation on a Google BigQuery target without using a Merge query and the dataset name, the table name, or both start with a number, the mapping fails. To run the mapping successfully, set the AllowQuotedIdentifier:true custom property in the Provide Optional Properties connection property.
  • When you read null and blank values from a Google BigQuery source in CSV format and perform an update operation on a Google BigQuery target, the null values are written as empty strings without quotes and the blank values are written as empty strings with quotes. To treat null values from the Google BigQuery source as null values in the Google BigQuery target, set the ReadNullAsNullCSVStaging:true custom property in the Provide Optional Properties connection property.
  • When you specify a project name, dataset name, or table name in a Google BigQuery V2 mapping, ensure that you do not use reserved keywords.
  • When you configure a Google BigQuery target and set the Data Format of the staging file to Avro, ensure that you do not map fields of DateTime data type.
  • When you configure a Google BigQuery target and set the Data Format of the staging file to Parquet, ensure that you do not map fields of Time and DateTime data types.
  • When you configure a Google BigQuery target and provide DestinationTable as any existing table in the pre SQL Configuration and post SQL Configuration, set the Write Disposition to Write append or Write truncate. Otherwise, the mapping fails.
  • When you run a mapping to read data of timestamp data type from a Google BigQuery source object, incorrect values are written to the target for certain timestamp values.
  • When you configure a mapping to read or write data of Record data type with nested fields, ensure that the nested fields do not have the same names.
  • When you set the Data Format of the staging file to Parquet and specify a Google BigQuery dataset name in the Source Staging Dataset source advanced property or the Target Staging Dataset target advanced property to override the Staging Dataset Name connection property, ensure that you have the required Google Cloud Storage bucket permission to read data from or write data to Google BigQuery successfully from the staging file in Google Cloud Storage.
  • When you configure the Staging Dataset connection property and create a new target at runtime, ensure that the Google BigQuery dataset where you want to create the target table has the required permission to create the target successfully.
  • When a Google BigQuery source contains columns with the REQUIRED constraint and you use Create New at Runtime to write data, the columns are created with the NULLABLE constraint in the target table.
  • When you read data from a column of BigNumeric data type, ensure that you do not select the Avro staging file format. When you write data to a column of BigNumeric data type, ensure that you do not select the Avro or Parquet staging file format.
  • When you read data from a column of BigNumeric data type, you cannot specify data filters when you use a Google BigQuery connection in simple or complex mode.
  • To pass a column of BigNumeric data type from a Google BigQuery source to a Filter transformation, you must pass the data through an Expression transformation and convert the column to Decimal data type using the TO_DECIMAL() function and map the results to the Filter transformation. Ensure that you specify a precision of 28 and scale of 9.
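    For example, in the Expression transformation you might create an output field of Decimal data type with a precision of 28 and a scale of 9, and configure an expression similar to the following, where BIGNUMERIC_COL is a placeholder field name:
    TO_DECIMAL(BIGNUMERIC_COL, 9)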
  • When you configure the Billing Project ID source advanced property and you configure an SQL override, pre-SQL query, or post-SQL query for the Google BigQuery source, you must use the Project ID value specified in the Google BigQuery V2 connection when you define the query.
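    For example, an SQL override query might look similar to the following, where <connection projectID> is the Project ID value from the Google BigQuery V2 connection and the dataset ID, table name, and column names are placeholders:
    SELECT <col1>, <col2> FROM `<connection projectID>.<datasetID>.<tableName>`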
  • When you configure the Billing Project ID source advanced property, you cannot read data from multiple source objects.
  • When you configure the Billing Project ID target advanced property, you cannot write data to a Google BigQuery target in CDC mode.
  • When you specify the Billing Project ID in the source and target advanced properties and run the mapping, a few connector calls appear in the Connection project. However, billing occurs only in the Billing project.
  • When you provide an incorrect value for the Billing Project ID property in a mapping, the mapping fails with an irrelevant error message:
    [ERROR] Error occured while trying to Initialize Data Source Operation | com.informatica.cci.runtime.internal.utils.impl.CExceptionImpl: !com.infa.adapter.bigqueryv2.runtime.adapter.BigqueryRuntimeException: Truncate Target Failed 400 Bad Request
    POST https://bigquery.googleapis.com/bigquery/v2/projects/automation-project21/jobs
    {"code" : 400, "errors" : [ { "domain" : "global", "message" : "ProjectId and DatasetId must be non-empty", "reason" : "badRequest" } ], "message" : "ProjectId and DatasetId must be non-empty", "status" : "INVALID_ARGUMENT" }
  • When you use the $$$SESSSTARTTIME variable in a custom query, the variable returns the session start time as a string value. Use the following syntax to convert the string value to timestamp or datetime:
    • SELECT PARSE_TIMESTAMP('%m/%d/%Y %H:%M:%E6S', '$$$SESSSTARTTIME' ) as timestamp
      --2022-10-06 18:53:28 UTC
    • SELECT cast(substr(cast('$$$SESSSTARTTIME' as string),0,19) as datetime FORMAT 'MM/DD/YYYY HH24:MI:SS') as datetime;
    • SELECT cast(substr(cast('$$$SESSSTARTTIME' as string),0,19) as timestamp FORMAT 'MM/DD/YYYY HH24:MI:SS') as timestamp;
    When you use the SESSSTARTTIME variable in an Expression transformation, the variable returns the session start time as the DateTime data type. The Filter and Expression transformations use SESSSTARTTIME as the system variable.
  • When you use the SESSSTARTTIME variable in a custom query without casting and the source data type for SESSSTARTTIME is string, you might see a difference in the data format when you compare a mapping that runs with full SQL ELT optimization and a mapping that runs without SQL ELT optimization.
    For example, SESSSTARTTIME returns a DateTime value in the MM/DD/YYYY HH24:MI:SS format when a mapping runs with full SQL ELT optimization. When you run the same mapping without SQL ELT optimization, SESSSTARTTIME appends additional zeroes to the return value, MM/DD/YYYY HH24:MI:SS.000000.
  • When you run a mapping without SQL ELT optimization, you must use only the columns that are mapped in the Update Column field.
  • You must select the is expression variable in-out parameter to read a parameter value as an expression in a Filter transformation when you run a mapping without SQL ELT optimization.
  • The session log doesn't record the staging optimization error messages in the following scenarios:
    • A mapping with a cached lookup transformation and the staging optimization at source enabled runs without staging optimization.
    • A mapping runs without enabling staging optimization at target.
  • When you map a string data type in the source to a time data type in the target, and the data is in the format HH24:MI:SS.US, the mapping fails with the following error:
    ERROR: Invalid Data Type Conversion
  • When an IN function includes a null value in the list of values in an Expression transformation, the mapping fails with the following error:
    TE_7002 Transformation stopped due to a fatal error in the mapping. The expression [In(col1_string,null,'1','2','3','6',0)] contains the following errors [Function validation for [In] failed: [The arguments must be of the same datatype.]. <<PM Parse Error>> [In]: : invalid function reference ... >>>>In(col1_string,null,'1','2','3','6',0)<<<<].
    To run the mapping successfully, use the TO_CHAR(null) function for the string data type and the TO_DATE(null) function for the date/time data type.
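    For example, for the string column in the failing expression above, you might rewrite the expression similar to the following:
    IN(col1_string, TO_CHAR(null), '1', '2', '3', '6', 0)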
    The mapping runs successfully if SQL ELT optimization is enabled.
  • When you use the IN function in an Expression transformation that includes a column with a null value in the list of values and there is no match, the function returns False instead of NULL.
    If the mapping is enabled with SQL ELT optimization, the function returns NULL.
  • If the source field name contains Unicode characters and you use the Create Target option to write to a Google BigQuery target, the mapping fails.
  • When you run a mapping to write data of the BigNumeric data type to a new target created at runtime, the scale is not honored in the data. The mapping processes the values up to 28 digits regardless of the scale.
  • When you map a BigNumeric data type to the Decimal data type with a precision greater than 28 digits, data truncation occurs at the target.
  • When you write data of the BigNumeric data type to a target created at runtime, BigNumeric defaults to Numeric data type in the target. You can use the edit metadata option to change the native data type from Numeric to BigNumeric.
  • When you configure an update, delete, or data driven operation to write data and if the metadata in the target does not match the source, the mapping behavior is not deterministic.
  • When you configure the REG_REPLACE(<column_name>, '.* ', <value_to_replace>) function in a mapping, the function replaces zero or more characters in the <column_name> argument with the value provided in the <value_to_replace> argument. In this scenario, the replaced value is applied twice in the column instead of once when the criteria are met.
    However, a mapping enabled for SQL ELT optimization with the REG_REPLACE() function correctly replaces the value with a single entry.
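    For example, an expression similar to the following, where FULL_NAME is a placeholder field name, replaces the characters that the '.* ' pattern matches in the FULL_NAME field with the replacement value:
    REG_REPLACE(FULL_NAME, '.* ', 'N/A')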
  • When the arguments are null in the REG_REPLACE() function, the mapping fails with the following error:
    Transformation stopped due to a fatal error in the mapping. The expression [Reg_Replace(null,null,null)] contains the following errors [Function validation for [Reg_Replace] failed: [The subject and pattern arguments must be of the char datatype.].
    However, a mapping enabled for SQL ELT optimization with the REG_REPLACE() function runs successfully and returns a null value.
  • When you configure a mapping, you cannot use view or materialized view as a target object.
  • When you run an existing mapping to write a view as the target object, override the target object with the Target Table Name advanced property, and set the optional property DisableMappingDeployment:true in the Google BigQuery V2 connection, the mapping fails with the following error:
    Operation failed: Internal Error Occurred. Contact Support : [Cannot create write operation. The node supports read operation only.].
    To run the mapping successfully, remove the optional property DisableMappingDeployment:true from the Google BigQuery V2 connection.
  • You can perform the following operations with views in a mapping:
    • Read data from Google BigQuery views in a Source transformation with staging optimization enabled.
    • Look up data from Google BigQuery views in a Lookup transformation.
  • You cannot enable staging optimization in a mapping if the Enable BigQuery Storage API option is selected in the Google BigQuery V2 connection. A mapping with this configuration fails with the following error:
    READER_1_1_1> _38644 [2024-02-21 11:56:54.700] Plug-in #601601:Log Message from CCI : [DTM staging is not supported when [Enable BigQuery Storage API] is checked].
  • When you configure Simple as the connection mode in the Google BigQuery V2 connection and Staging as the read mode, you cannot use views as a source or lookup object if the input data contains the Record data type.
    If you run a mapping with this configuration, the mapping fails with the following error:
    [ERROR] You cannot use [Standard] SQL view as it is not compatible with the Use Legacy SQL for Custom Query parameter selection in the Google BigQuery connection. Ensure that the view type matches with the selected SQL language in the connection.
    To run the mapping successfully, you can configure Hybrid as the connection mode in the Google BigQuery V2 connection or Direct as the read mode.
  • When you configure a Lookup transformation to return either the first or last row and the incoming fields contain columns with the Record data type, the mapping fails with the following error:
    The [QUERY] job failed with the following error: 'ORDER BY' does not support expressions of type 'STRUCT<...>'
    This issue occurs when the configured connection mode is hybrid.
  • You cannot configure a partitioned table with a filter as a single source object or in target operations. If you configure a mapping with these scenarios, the mapping fails with the following error:
    [ERROR] The [QUERY] job failed with the following error: [Cannot query over table <table name> without a filter over column(s) <column name>, <column name>, <column name> that can be used for partition elimination]
  • When you read columns with the Record data type, where the connection uses simple mode and the mapping uses staging as the read mode, you need to set the Use Legacy SQL for Custom Query property in the Google BigQuery V2 connection to run the mapping successfully. Otherwise, the mapping fails with the following error:
    [ERROR] The [QUERY] job failed with the following error: [Invalid schema update. Cannot add fields (field: DateTime)]
  • To read source data that contains Repeated data type columns with the simple connection mode, enable the Use Legacy SQL for Custom Query option in the connection properties. To use the hybrid connection mode, clear the option so that the mapping uses standard SQL instead of legacy SQL to read Repeated data type columns.
