Table of Contents

Search

  1. Preface
  2. Introduction to PowerExchange for Google BigQuery
  3. PowerExchange for Google BigQuery Configuration
  4. Google BigQuery Connections
  5. PowerExchange for Google BigQuery Data Objects
  6. PowerExchange for Google BigQuery Mappings
  7. Google BigQuery Lookup
  8. Google BigQuery Run-Time Processing
  9. Appendix A: Google BigQuery Data Type Reference

PowerExchange for Google BigQuery User Guide

PowerExchange for Google BigQuery User Guide

Advanced Properties

Advanced Properties

The Data Integration Service reads data from Google BigQuery based on the data object read operation.
The Developer tool displays advanced properties for the Google BigQuery data object operation in the Advanced view.
The following table describes the advanced properties for a Google BigQuery data object read operation:
Property
Description
Source Dataset ID
Optional. Overrides the Google BigQuery dataset name specified in the Google BigQuery data object read operation.
Source Table Name
Overrides the Google BigQuery table name specified in the Google BigQuery data object read operation.
Number of Rows to Read
Specifies the number of rows to read from the Google BigQuery source.
Enter a number greater than 0.
Spark Mode
Applies when you run a mapping on the Spark engine.
Specifies the mode of the Spark engine to read data from the Google BigQuery source.
You can select
Generic
or
Optimized
.
Default is Generic mode.
Select Optimized mode to increase the performance to read data from the Google BigQuery source.
For more information about the list of advanced properties that you can configure with Optimized Spark mode, see the Informatica How-To Library, "Configuring Optimized Spark Mode for PowerExchange for Google BigQuery Data Objects":
Allow Large Results
Determines whether PowerExchange for Google BigQuery must produce arbitrarily large result tables to query large source tables.
If you select this option, you must specify a destination table to store the query results.
Query Results Table Name
Required if you select the
Allow Large Results
option.
Specifies the destination table name to store the query results. If the table is not present in the dataset, PowerExchange for Google BigQuery creates the destination table with the name that you specify.
Job Poll Interval in Seconds
The number of seconds after which PowerExchange for Google BigQuery polls the status of the read job operation.
Enter a number greater than 0. Default is 10.
Read Mode
Specifies the read mode to read data from the Google BigQuery source.
You can select one the following read modes:
  • Direct. In direct mode, PowerExchange for Google BigQuery reads data directly from the Google BigQuery source table.
    When you use hybrid and complex connection mode, you cannot use direct mode to read data from the Google BigQuery source.
  • Staging. In staging mode, PowerExchange for Google BigQuery exports the data from the Google BigQuery source table into Google Cloud Storage.
    After the download is complete, PowerExchange for Google BigQuery downloads the data from Google Cloud Storage into the local stage file that you specify into the local stage file and then reads data from the local stage file.
Default is Direct mode.
Number of Threads for Downloading Staging Files
Specifies the number of files that PowerExchange for Google BigQuery downloads at a time to enable parallel download.
Enter a number greater than 0.
This property applies to staging mode.
Data Format of the staging file
Specifies the data format of the staging file. You can select one of the following data formats:
  • JSON (Newline Delimited). Supports flat and record data with nested and repeated fields.
  • CSV. Supports flat data.
    If you use hybrid and complex connection mode, you cannot use CSV format as the data format of the staging file.
Local Stage File Directory
Specifies the directory on your local machine where PowerExchange for Google BigQuery stores Google BigQuery source data temporarily before it reads the data.
This property applies to staging mode.
Staging File Name
Name of the staging file where data from the Google BigQuery source table is exported to Google Cloud Storage.
This property applies to staging mode.
Enable Staging File Compression
Indicates whether to compress the size of the staging file in Google Cloud Storage before PowerExchange for Google BigQuery reads data from the staging file.
You can enable staging file compression to reduce cost and transfer time.
This property applies to staging mode.
Persist Destination Table
Indicates whether PowerExchange for Google BigQuery must persist the query results table after it reads data from the query results table.
By default, PowerExchange for Google BigQuery deletes the query results table
pre SQL
SQL statement that you want to run before reading data from the source.
For example, if you want to select records in the database before you read the records from the table, specify the following pre SQL statement:
SELECT * FROM [api-project-80697026669:EMPLOYEE.DEPARTMENT] LIMIT 1000;
pre SQL Configuration
Specify a pre SQL configuration.
For example,
DestinationTable:PRESQL_SRC,DestinationDataset:EMPLOYEE,
FlattenResults:False,WriteDisposition:WRITE_TRUNCATE,UseLegacySql:False
post SQL
SQL statement that you want to run after reading data from the source.
For example, if you want to update records in a table after you read the records from a source table, specify the following post SQL statement:
UPDATE [api-project-80697026669.EMPLOYEE.PERSONS_TGT_DEL]
SET phoneNumber.number=1000011, phoneNumber.areaCode=100 where fullname='John Doe'
SET phoneNumber.number =1000011, phoneNumber.areaCode=100 where fullname='John Doe'
post SQL Configuration
Specify a post SQL configuration.
For example,
DestinationTable:POSTSQL_SRC,DestinationDataset:EMPLOYEE,
FlattenResults:True,WriteDisposition:WRITE_TRUNCATE,UseLegacySql:False
SQL Override Query
Overrides the default SQL query used to extract data from the Google BigQuery source.
When you specify a SQL override query, you must specify a dataset name in the SQL query.
Use Legacy SQL for SQL Override
Uses legacy SQL override query.
You can clear this option to define a standard SQL override query.
Partitions Per Core
Specifies the number of partitions to split the data.
Default is 1.
This property applies to Optimized spark mode.

0 COMMENTS

We’d like to hear from you!