Google BigQuery Connectors

Google BigQuery targets in mappings

To write data to a Google BigQuery target, configure a Google BigQuery object as the Target transformation in a mapping.
Specify the name and description of the Google BigQuery target, and configure the target and advanced properties for the target object.
The following table describes the target properties that you can configure for a Google BigQuery target:
Property
Description
Connection
Name of the active Google BigQuery connection that is associated with a dataset.
Target Type
Type of the Google BigQuery target objects available.
You can write data to a single Google BigQuery target object or parameterize the object. You cannot write data to multiple objects.
Object
Name of the Google BigQuery target object based on the target type selected.
Create New at Runtime
Creates a target.
Enter a name for the target object and select the source fields that you want to use. By default, all source fields are used.
The target name can contain alphanumeric characters. You can use the following special characters in the target name: ., _, @, $, %
Google BigQuery Connector creates a new target table in the Dataset ID specified in the Google BigQuery connection.
Operation
You can select one of the following operations:
  • Insert
  • Update
  • Upsert (Update or Insert)
  • Delete
If you use complex connection mode, you cannot configure update, upsert, and delete operations.
Update Columns
Specifies the temporary primary key columns to update, upsert, or delete target data. If the Google BigQuery target does not include a primary key column, and the mapping performs an update, upsert, or delete task operation, click Add to add a temporary key.
You can select multiple columns. By default, no columns are specified.
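Because BigQuery tables do not enforce primary keys, the key columns you select are what identify the target rows for an update, upsert, or delete. The effect of an upsert keyed on those columns can be sketched as a BigQuery MERGE statement; this is an illustrative sketch only (the table names, column names, and helper are hypothetical, and the SQL the connector actually generates may differ):

```python
def build_merge_sql(target, staging, key_cols, update_cols):
    """Sketch: build a BigQuery MERGE that upserts staging rows into the
    target, matching on the temporary key columns. Illustrative only."""
    on = " AND ".join(f"T.{c} = S.{c}" for c in key_cols)
    sets = ", ".join(f"T.{c} = S.{c}" for c in update_cols)
    cols = ", ".join(key_cols + update_cols)
    vals = ", ".join(f"S.{c}" for c in key_cols + update_cols)
    return (
        f"MERGE `{target}` T USING `{staging}` S ON {on} "
        f"WHEN MATCHED THEN UPDATE SET {sets} "
        f"WHEN NOT MATCHED THEN INSERT ({cols}) VALUES ({vals})"
    )

# Hypothetical table and column names, for illustration only.
sql = build_merge_sql("proj.ds.tgt", "proj.ds.stage", ["id"], ["name", "city"])
```

Rows whose key columns match an existing target row are updated; all other rows are inserted, which is the upsert behavior described above.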
The following table describes the advanced properties that you can configure for a Google BigQuery target:
Property
Description
Target Dataset ID
Optional. Overrides the Google BigQuery dataset name that you specified in the connection.
Target Table Name
Optional. Overrides the Google BigQuery target table name that you specified in the Target page of the synchronization task.
Create Disposition
Specifies whether Google BigQuery Connector must create the target table if it does not exist.
You can select one of the following values:
  • Create if needed. If the table does not exist, Google BigQuery Connector creates the table.
  • Create never. If the table does not exist, Google BigQuery Connector does not create the table and displays an error message.
Write Disposition
Specifies how Google BigQuery Connector must write data in bulk mode if the target table already exists.
You can select one of the following values:
  • Write append. If the target table exists, Google BigQuery Connector appends the data to the existing data in the table.
  • Write truncate. If the target table exists, Google BigQuery Connector overwrites the existing data in the table.
  • Write empty. If the target table exists and contains data, Google BigQuery Connector displays an error and does not write the data to the target. Google BigQuery Connector writes the data to the target only if the target table does not contain any data.
Write disposition is applicable for bulk mode.
Write disposition is applicable only when you perform an insert operation on a Google BigQuery target.
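The three write disposition values can be summarized with a small simulation on an in-memory row list. This is illustrative only; in the connector these values correspond to settings on the BigQuery load job, not local logic:

```python
def apply_write_disposition(existing, incoming, disposition):
    """Simulate BigQuery write dispositions on in-memory row lists.
    Sketch of the semantics only, not the connector's implementation."""
    if disposition == "WRITE_APPEND":
        # Append incoming rows to the existing data.
        return existing + incoming
    if disposition == "WRITE_TRUNCATE":
        # Overwrite the existing data with the incoming rows.
        return list(incoming)
    if disposition == "WRITE_EMPTY":
        # Fail if the target already contains data.
        if existing:
            raise ValueError("target table already contains data")
        return list(incoming)
    raise ValueError(f"unknown disposition: {disposition}")
```

For example, `apply_write_disposition([1], [2, 3], "WRITE_APPEND")` keeps the existing row and adds the new ones, while `WRITE_TRUNCATE` with the same inputs keeps only the new rows.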
Write Mode
Specifies the mode to write data to the Google BigQuery target.
You can select one of the following modes:
  • Bulk. In bulk mode, Google BigQuery Connector first writes the data to a staging file in Google Cloud Storage. When the staging file contains all the data, Google BigQuery Connector loads the data from the staging file to the BigQuery target. Google BigQuery Connector then deletes the staging file unless you configure the task to persist the staging file.
  • Streaming. In streaming mode, Google BigQuery Connector directly writes data to the BigQuery target. Google BigQuery Connector writes the data into the target row by row.
Default is Bulk mode.
Streaming Template Table Suffix
Specify the suffix to add to the individual target tables that Google BigQuery Connector creates based on the template target table.
This property applies to streaming mode.
Rows per Streaming Request
Specifies the number of rows that Google BigQuery Connector streams to the BigQuery target for each request.
Default is 500 rows.
The maximum row size that Google BigQuery Connector can stream to the Google BigQuery target for each request is 10 MB.
This property applies to streaming mode.
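The effect of the Rows per Streaming Request setting is to split the row set into fixed-size batches, one per streaming request. A minimal sketch of that batching, assuming the default of 500 rows (the 10 MB per-request size limit is not checked here):

```python
def batch_rows(rows, rows_per_request=500):
    """Yield batches of at most rows_per_request rows, mirroring how a
    streaming writer splits rows across requests. Illustrative sketch;
    500 matches the connector's documented default."""
    for i in range(0, len(rows), rows_per_request):
        yield rows[i:i + rows_per_request]

# 1200 rows at 500 rows per request -> 3 requests (500, 500, 200).
batches = list(batch_rows(list(range(1200)), 500))
```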
Staging file name
Name of the staging file that Google BigQuery Connector creates in the Google Cloud Storage before it loads the data to the Google BigQuery target.
This property applies to bulk mode.
Data Format of the staging file
Specifies the data format of the staging file. You can select one of the following data formats:
  • JSON (Newline Delimited). Supports flat and record data with nested and repeated fields.
  • CSV. Supports flat data.
    In a .csv file, columns of the Timestamp data type are represented as floating point numbers that cause the milliseconds value to differ.
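The difference between the two staging formats can be seen with a record that has a nested field: newline-delimited JSON keeps the nesting, while CSV can only carry flat columns. The record and column names below are hypothetical, for illustration only:

```python
import csv
import io
import json

# Hypothetical record with a nested field.
record = {"id": 1, "phone": {"areaCode": 100, "number": 1000011}}

# JSON (newline delimited): one JSON object per line, nesting preserved.
ndjson_line = json.dumps(record)

# CSV: flat data only, so nested fields must become separate flat columns.
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["id", "phone_areaCode", "phone_number"])
writer.writerow([record["id"], record["phone"]["areaCode"], record["phone"]["number"]])
csv_text = buf.getvalue()
```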
Persist Staging File After Loading
Indicates whether Google BigQuery Connector must persist the staging file in the Google Cloud Storage after it writes the data to the Google BigQuery target. You can persist the staging file if you want to archive the data for future reference.
By default, Google BigQuery Connector deletes the staging file in Google Cloud Storage.
This property applies to bulk mode.
Enable Staging File Compression
Select this option to compress the staging file before Google BigQuery Connector writes the data to Google Cloud Storage and to decompress the staging file before it loads the data to the Google BigQuery target.
You can enable staging file compression to reduce cost and transfer time.
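The savings come from compressing the staging data before upload. A minimal sketch with gzip, using Python's standard library (illustrative only; the connector performs the compression internally when this option is enabled):

```python
import gzip

# Hypothetical newline-delimited JSON staging content.
staging_bytes = b'{"id": 1}\n{"id": 2}\n' * 1000

# Compress before uploading to Google Cloud Storage...
compressed = gzip.compress(staging_bytes)

# ...and decompress before loading into the BigQuery target.
restored = gzip.decompress(compressed)
```

Repetitive row data such as this compresses well, which is what reduces transfer time and storage cost.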
Job Poll Interval in Seconds
The number of seconds after which Google BigQuery Connector polls the status of the write job operation.
Default is 10.
Number of Threads for Uploading Staging file
The number of files that Google BigQuery Connector must create to upload the staging file in bulk mode.
Local Stage File Directory
Specifies the directory on your local machine where Google BigQuery Connector stores the files temporarily before writing the data to the staging file in Google Cloud Storage.
This property applies to bulk mode.
Allow Quoted Newlines
Indicates whether Google BigQuery Connector must allow quoted data sections that contain newline characters in a .csv file.
Field Delimiter
Indicates whether Google BigQuery Connector must allow field separators for the fields in a .csv file.
Allow Jagged Rows
Indicates whether Google BigQuery Connector must accept the rows without trailing columns in a .csv file.
Pre SQL
SQL statement that you want to run before writing data to the target.
For example, if you want to select records from the database before you write the records into the table, specify the following pre SQL statement:
SELECT * FROM `api-project-80697026669.EMPLOYEE.RegionNation` LIMIT 1000
Pre SQL Configuration
Specify a pre SQL configuration.
For example,
DestinationTable:PRESQL_TGT2,DestinationDataset:EMPLOYEE,
FlattenResults:False,WriteDisposition:WRITE_TRUNCATE,UseLegacySql:False
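The configuration above is a comma-separated list of Key:Value pairs. A small sketch of parsing that format into a lookup table, with the format inferred from the example (the parsing helper is hypothetical, not part of the connector):

```python
def parse_sql_config(config: str) -> dict:
    """Parse a pre/post SQL configuration string of comma-separated
    Key:Value pairs into a dict. Format inferred from the documented
    example; illustrative only."""
    pairs = (item.split(":", 1) for item in config.split(",") if item.strip())
    return {key.strip(): value.strip() for key, value in pairs}

cfg = parse_sql_config(
    "DestinationTable:PRESQL_TGT2,DestinationDataset:EMPLOYEE,"
    "FlattenResults:False,WriteDisposition:WRITE_TRUNCATE,UseLegacySql:False"
)
```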
Post SQL
SQL statement that you want to run after writing the data into the target.
For example, if you want to update records in a table after you write the records into the target table, specify the following post SQL statement:
UPDATE [api-project-80697026669.EMPLOYEE.PERSONS_TGT_DEL]
SET phoneNumber.number = 1000011, phoneNumber.areaCode = 100 WHERE fullname = 'John Doe'
Post SQL Configuration
Specify a post SQL configuration.
For example,
DestinationTable:POSTSQL_SRC,DestinationDataset:EMPLOYEE,
FlattenResults:True,UseLegacySQL:False
Success File Directory
Not applicable for Google BigQuery Connector.
Error File Directory
Not applicable for Google BigQuery Connector.
Forward Rejected Rows
Not applicable for Google BigQuery Connector.
