Table of Contents

Search

  1. Preface
  2. Part 1: Introduction to Google BigQuery connectors
  3. Part 2: Data Integration with Google BigQuery V2 Connector
  4. Part 3: Data Integration with Google BigQuery Connector

Google BigQuery Connectors

Google BigQuery Connectors

Google BigQuery sources in mappings

Google BigQuery sources in mappings

To read data from Google BigQuery, configure a Google BigQuery object as the Source transformation in a mapping.
Specify the name and description of the Google BigQuery source. Configure the source, query options, and advanced properties for the source object.
The following table describes the source properties that you can configure for a Google BigQuery source:
Property
Description
Connection
Name of the active Google BigQuery source connection.
Source Type
Type of the Google BigQuery source objects available.
You can read data from a single Google BigQuery source object or parameterize the object. You cannot read data from multiple objects.
Object
Name of the Google BigQuery source object based on the source type selected.
Filter
Configure a simple filter or an advanced filter to remove rows at the source. You can improve efficiency by filtering early in the data flow.
A simple filter includes a field name, operator, and value. Use an advanced filter to define a more complex filter condition, which can include multiple conditions using the AND or OR logical operators.
The following table describes the advanced properties that you can configure for a Google BigQuery source:
Property
Description
Source Dataset ID
Optional. Overrides the Google BigQuery dataset name that you specified in the connection.
Number of Rows to Read
Specifies the number of rows to read from the Google BigQuery source table.
Allow Large Results
Determines whether Google BigQuery Connector must produce arbitrarily large result tables to query large source tables.
If you select this option, you must specify a destination table to store the query results.
Query Results Table Name
Required if you select the
Allow Large Results
option.
Specifies the destination table name to store the query results. If the table is not present in the dataset, Google BigQuery Connector creates the destination table with the name that you specify.
Job Poll Interval in Seconds
The number of seconds after which Google BigQuery Connector polls the status of the read job operation.
Default is 10.
Read Mode
Specifies the read mode to read data from the Google BigQuery source.
You can select one the following read modes:
  • Direct. In direct mode, Google BigQuery Connector reads data directly from the Google BigQuery source table.
    When you use hybrid and complex connection mode, you cannot use direct mode to read data from the Google BigQuery source.
  • Staging. In staging mode, Google BigQuery Connector exports data from the Google BigQuery source into Google Cloud Storage. After the export is complete, Google BigQuery Connector downloads the data from Google Cloud Storage into the local stage file and then reads data from the local stage file.
Default is Direct mode.
Number of Threads for Downloading Staging Files
Specifies the number of files that Google BigQuery Connector downloads at a time to enable parallel download.
This property applies to staging mode.
Data Format of the staging file
Specifies the data format of the staging file. You can select one of the following data formats:
  • JSON (Newline Delimited). Supports flat and record data with nested and repeated fields.
  • CSV. Supports flat data.
    In a .csv file, columns of the Timestamp data type are represented as floating point numbers that cause the milliseconds value to differ.
Avro format is not applicable for Google BigQuery Connector.
Local Stage File Directory
Specifies the directory on your local machine where Google BigQuery Connector stores the Google BigQuery source data temporarily before it reads the data.
This property applies to staging mode.
Staging File Name
Name of the staging file where data from the Google BigQuery source table is exported to Google Cloud Storage.
This property applies to staging mode.
Enable Staging File Compression
Indicates whether to compress the size of the staging file in Google Cloud Storage before Google BigQuery Connector reads data from the staging file.
You can enable staging file compression to reduce cost and transfer time.
This property applies to staging mode.
Persist Destination Table
Indicates whether Google BigQuery Connector must persist the query results table after it reads data from the query results table.
By default, Google BigQuery Connector deletes the query results table.
pre SQL
SQL statement that you want to run before reading data from the source.
For example, if you want to select records in the database before you read the records from the table, specify the following pre SQL statement:
SELECT * FROM [api-project-80697026669:EMPLOYEE.DEPARTMENT] LIMIT 1000;
pre SQL Configuration
Specify a pre SQL configuration.
For example,
DestinationTable:PRESQL_SRC,DestinationDataset:EMPLOYEE,
FlattenResults:False,WriteDisposition:WRITE_TRUNCATE,UseLegacySql:False
post SQL
SQL statement that you want to run after reading data from the source.
For example, if you want to update records in a table after you read the records from a source table, specify the following post SQL statement:
UPDATE [api-project-80697026669.EMPLOYEE.PERSONS_TGT_DEL]
SET phoneNumber.number=1000011, phoneNumber.areaCode=100 where fullname='John Doe'
post SQL Configuration
Specify a post SQL configuration.
For example,
DestinationTable:POSTSQL_SRC,DestinationDataset:EMPLOYEE,
FlattenResults:True,WriteDisposition:WRITE_TRUNCATE,UseLegacySql:False
You can set the tracing level in the advanced properties session to determine the amount of details that logs contain.
The following table describes the tracing levels that you can configure:
Property
Description
Terse
The Secure Agent logs initialization information, error messages, and notification of rejected data.
Normal
The Secure Agent logs initialization and status information, errors encountered, and skipped rows due to transformation row errors. Summarizes session results, but not at the level of individual rows.
Verbose Initialization
In addition to normal tracing, the Secure Agent logs additional initialization details, names of index and data files used, and detailed transformation statistics.
Verbose Data
In addition to verbose initialization tracing, the Secure Agent logs each row that passes into the mapping. Also notes where the Secure Agent truncates string data to fit the precision of a column and provides detailed transformation statistics.
When you configure the tracing level to verbose data, the Secure Agent writes row data for all rows in a block when it processes a transformation.

0 COMMENTS

We’d like to hear from you!