PowerCenter Modernization

Google BigQuery advanced properties

The following tables list the Google BigQuery properties that you can configure for a Connection Map configuration:

Object properties

The following table describes the Google BigQuery connection properties:
Connection property
Description
Source Dataset ID
Name of the dataset that contains the source table and target table that you want to connect to.
Default is datasetID_changeit.
Only datasets that reside in the US region are supported.

Source advanced properties

The following table describes the Google BigQuery source advanced properties:
Property name
Description
Source Dataset ID
Overrides the Google BigQuery dataset name that you specified in the connection.
Default is $$sf_database.
Source Staging Dataset
Overrides the Google BigQuery staging dataset name that you specified in the connection and the Source Dataset ID source advanced property.
Allow Large Results
Determines whether Google BigQuery Connector must produce arbitrarily large result tables to query large source tables. If you select this option, you must specify a destination table to store the query results.
Default is false.
Job Poll Interval In Seconds
The number of seconds after which Google BigQuery Connector polls the status of the read job operation.
Default is 10.
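
The polling behavior described above can be pictured as a fixed-interval loop. The following is a minimal sketch, not the connector's implementation: `get_job_state` is a hypothetical callable that stands in for whatever status check the connector performs, and the `DONE` state name is an assumption.

```python
import time
from typing import Callable


def wait_for_job(get_job_state: Callable[[], str],
                 poll_interval_seconds: float = 10,
                 sleep: Callable[[float], None] = time.sleep) -> int:
    """Poll until the job reports DONE; return the number of polls made."""
    polls = 0
    while True:
        polls += 1
        # get_job_state is a hypothetical status check (assumption).
        if get_job_state() == "DONE":
            return polls
        # Wait for the configured interval before checking again.
        sleep(poll_interval_seconds)
```

A smaller interval detects completion sooner at the cost of more status requests. The `sleep` parameter exists only so that the loop can be exercised without real waiting.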
Read Mode
Specifies the read mode to read data from the Google BigQuery source.
You can select one of the following read modes:
  • Direct. In direct mode, Google BigQuery Connector reads data directly from the Google BigQuery source table.
    When you use hybrid and complex connection mode, you cannot use direct mode to read data from the Google BigQuery source.
  • Staging. In staging mode, Google BigQuery Connector exports data from the Google BigQuery source into Google Cloud Storage. After the export is complete, Google BigQuery Connector downloads the data from Google Cloud Storage into the local stage file and then reads data from the local stage file.
Default is Staging mode.
Use EXPORT DATA Statement to stage
Uses the EXPORT DATA statement to export data from Google BigQuery to Google Cloud Storage.
If the query contains an ORDER BY clause, the specified order is maintained when you export the data.
This property applies to staging mode.
Default is false.
Number of Threads for Downloading Staging Files
Specifies the number of files that Google BigQuery Connector downloads at a time to enable parallel download.
This property applies to staging mode.
Default is 1.
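
As an illustration of how a thread count bounds parallel downloads, the following sketch uses a thread pool. It is not the connector's code: `download_one` is a hypothetical function that fetches a single staging file and returns its local path.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def download_all(file_names: Iterable[str],
                 download_one: Callable[[str], str],
                 num_threads: int = 1) -> List[str]:
    """Download staging files with at most num_threads downloads in flight."""
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # map() returns results in the same order as the input names.
        return list(pool.map(download_one, file_names))
```

With `num_threads=1`, the files are downloaded one at a time, which corresponds to the default value of the property.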
Local Stage File Directory
Specifies the directory on your local machine where Google BigQuery Connector stores the Google BigQuery source data temporarily before it reads the data.
This property applies to staging mode.
Default is $PMTempDir.
Data format of the staging file
Specifies the data format of the staging file. You can select one of the following data formats:
  • JSON (Newline Delimited). Supports flat and record data with nested and repeated fields.
  • CSV. Supports flat data.
    In a .csv file, columns of the Timestamp data type are represented as floating point numbers that cause the milliseconds value to differ.
Default is JSON.
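
To show why newline-delimited JSON can carry nested and repeated fields while CSV is limited to flat data, the following sketch serializes the same records both ways. The sample records and function names are illustrative assumptions, not part of the connector.

```python
import csv
import io
import json

# Illustrative records: "address" is a nested field, "tags" a repeated field.
rows = [
    {"id": 1, "name": "Ada", "address": {"city": "London"}, "tags": ["a", "b"]},
    {"id": 2, "name": "Lin", "address": {"city": "Taipei"}, "tags": []},
]


def to_ndjson(records):
    """Write one JSON object per line; nested and repeated fields are kept."""
    return "\n".join(json.dumps(r) for r in records)


def to_csv(records, columns):
    """Write only the listed flat columns; other fields are dropped."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=columns,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()
```

Calling `to_csv(rows, ["id", "name"])` keeps only the flat columns, whereas each line produced by `to_ndjson(rows)` can be parsed back into the full record.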
Enable Staging File Compression
Indicates whether to compress the size of the staging file in Google Cloud Storage before Google BigQuery Connector reads data from the staging file.
You can enable staging file compression to reduce cost and transfer time.
This property applies to staging mode.
Default is false.
Retry Options
Comma-separated list to specify the following retry options:
  • Retry Count. The number of retry attempts to read data from Google BigQuery.
  • Retry Interval. The time in seconds to wait between each retry attempt.
  • Retry Exceptions. The list of exceptions separated by pipe (|) character for which the retries are made.
Default is $$gbq_retry_options.
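
The exact syntax of the retry options value is not shown here. The following sketch assumes a `key=value` form such as `RetryCount=5,RetryInterval=1,RetryExceptions=java.net.ConnectException|java.io.IOException`; the key names and the sample exception names are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RetryOptions:
    count: int = 0
    interval_seconds: int = 0
    exceptions: List[str] = field(default_factory=list)


def parse_retry_options(value: str) -> RetryOptions:
    """Parse a comma-separated retry string (key names are assumed)."""
    opts = RetryOptions()
    for part in value.split(","):
        if not part.strip():
            continue
        key, _, raw = part.partition("=")
        key, raw = key.strip(), raw.strip()
        if key == "RetryCount":
            opts.count = int(raw)
        elif key == "RetryInterval":
            opts.interval_seconds = int(raw)
        elif key == "RetryExceptions":
            # Exceptions are separated by the pipe (|) character.
            opts.exceptions = [e for e in raw.split("|") if e]
        else:
            raise ValueError(f"unknown retry option: {key!r}")
    return opts
```

Splitting on commas first and on the pipe character second mirrors the two levels of separators that the property description names.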
Use Legacy SQL for SQL Override
Indicates that the SQL Override query is specified in legacy SQL.
Use the following format to specify a legacy SQL query for the SQL Override Query property:
SELECT <Col1, Col2, Col3> FROM [projectID:datasetID.tableName]
Clear this option to define a standard SQL override query.
Use the following format to specify a standard SQL query for the SQL Override Query property:
SELECT * FROM `projectID.datasetID.tableName`
Default is false.
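
The two table-reference formats differ only in their delimiters: legacy SQL uses square brackets with a colon after the project ID, and standard SQL uses backticks with periods throughout. The following helper is a small sketch for building either form; the function names are illustrative and are not part of any Informatica or Google API.

```python
def table_reference(project_id: str, dataset_id: str, table_name: str,
                    use_legacy_sql: bool = False) -> str:
    """Return a fully qualified table reference in the chosen SQL dialect."""
    if use_legacy_sql:
        # Legacy SQL: [projectID:datasetID.tableName]
        return f"[{project_id}:{dataset_id}.{table_name}]"
    # Standard SQL: `projectID.datasetID.tableName`
    return f"`{project_id}.{dataset_id}.{table_name}`"


def select_all(project_id: str, dataset_id: str, table_name: str,
               use_legacy_sql: bool = False) -> str:
    """Build a SELECT * override query for the given table."""
    ref = table_reference(project_id, dataset_id, table_name, use_legacy_sql)
    return f"SELECT * FROM {ref}"
```

The `use_legacy_sql` flag plays the same role as the Use Legacy SQL for SQL Override option: the query text must match the dialect that the option selects.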

Target advanced properties

The following table describes the Google BigQuery target advanced properties:
Property name
Description
Target Dataset ID
Overrides the Google BigQuery dataset name that you specified in the connection.
Default is $$gbq_datasetID.
Write Mode
Specifies the mode to write data to the Google BigQuery target.
You can select one of the following modes:
  • Bulk. Google BigQuery V2 Connector first writes the data to a staging file in Google Cloud Storage. When the staging file contains all the data, Google BigQuery V2 Connector loads the data from the staging file to the BigQuery target. Google BigQuery V2 Connector then deletes the staging file unless you configure the task to persist the staging file.
  • Streaming. Google BigQuery V2 Connector directly writes data to the BigQuery target. Google BigQuery V2 Connector writes the data into the target row by row.
  • CDC. Applies only when you capture changed data from a CDC source. In CDC mode, Google BigQuery V2 Connector captures changed data from any CDC source and writes the changed data to a Google BigQuery target table.
Default is Bulk mode.
Data format of the staging file
Specifies the data format of the staging file. You can select one of the following data formats:
  • Avro
  • JSON (Newline Delimited). Supports flat and record data with nested and repeated fields.
  • Parquet
  • CSV. Supports flat data.
    In a .csv file, columns of the Timestamp data type are represented as floating point numbers that cause the milliseconds value to differ.
Only JSON format is applicable for mappings in advanced mode.
This property applies to bulk and CDC mode.
Avro and Parquet formats are not applicable when you perform a data driven operation.
Default is JSON.
Enable Staging File Compression
Select this option to compress the staging file before Google BigQuery writes the data to Google Cloud Storage, and to decompress the staging file before it loads the data to the Google BigQuery target.
You can enable staging file compression to reduce cost and transfer time.
Default is false.
Local Stage File Directory
Specifies the directory on your local machine where Google BigQuery V2 Connector stores the files temporarily before writing the data to the staging file in Google Cloud Storage.
This property applies to bulk mode.
Default is $PMTempDir.
Use Default Column Values
Applies when the selected data format for the staging file is CSV and the mapping contains unconnected ports. Includes the default column values for the unconnected ports from the staging file to create the target. This applies when you have defined a default constraint value in the Google BigQuery source column. When you do not enable this option, the agent creates a target with only the connected ports and populates null or empty strings for the unconnected ports.
Default is true.

Lookup advanced properties

The following table describes the Google BigQuery lookup transformation advanced properties:
Property name
Description
Source Dataset ID
Overrides the Google BigQuery dataset name that you specified in the connection.
Default is $$gbq_datasetID.
Source Staging Dataset
Overrides the Google BigQuery staging dataset name that you specified in the Lookup transformation.
Allow Large Results
Determines whether Google BigQuery V2 Connector creates arbitrarily large result tables to query large source tables.
If you select this option, you must specify a destination table to store the query results.
Default is false.
Job Poll Interval In Seconds
The number of seconds after which Google BigQuery V2 Connector polls the status of the read job operation.
Default is 10.
Read Mode
Specifies the read mode to read data from the Google BigQuery source.
You can select one of the following read modes:
  • Direct. In direct mode, Google BigQuery V2 Connector reads data directly from the Google BigQuery source table.
    When you use hybrid and complex connection mode, you cannot use direct mode to read data from the Google BigQuery source.
  • Staging. In staging mode, Google BigQuery V2 Connector exports data from the Google BigQuery source into Google Cloud Storage. After the export is complete, Google BigQuery V2 Connector downloads the data from Google Cloud Storage into the local stage file and then reads data from the local stage file.
Default is Staging mode.
Use EXPORT DATA Statement to stage
Uses the EXPORT DATA statement to export data from Google BigQuery to Google Cloud Storage.
If the query contains an ORDER BY clause, the specified order is maintained when you export the data.
This property applies to staging mode.
Default is true.
Number of Threads for Downloading Staging Files
Specifies the number of files that Google BigQuery Connector downloads at a time to enable parallel download.
This property applies to staging mode.
Default is 1.
Local Stage File Directory
Specifies the directory on your local machine where Google BigQuery V2 Connector stores the Google BigQuery source data temporarily before it reads the data.
This property applies to staging mode.
Default is $PMTempDir.
Data format of the staging file
Specifies the data format of the staging file. You can select one of the following data formats:
  • Avro
  • JSON (Newline Delimited). Supports flat and record data with nested and repeated fields.
  • Parquet
  • CSV. Supports flat data.
    In a .csv file, columns of the Timestamp data type are represented as floating point numbers that cause the milliseconds value to differ.
Only JSON format is applicable for mappings in advanced mode.
This property applies to bulk and CDC mode.
Avro and Parquet formats are not applicable when you perform a data driven operation.
Default is JSON.
Enable Staging File Compression
Indicates whether to compress the size of the staging file in Google Cloud Storage before Google BigQuery V2 Connector reads data from the staging file.
You can enable staging file compression to reduce cost and transfer time.
This property applies to staging mode.
Default is false.
Retry Options
Comma-separated list to specify the following retry options:
  • Retry Count. The number of retry attempts to read data from Google BigQuery.
  • Retry Interval. The time in seconds to wait between each retry attempt.
  • Retry Exceptions. The list of exceptions separated by pipe (|) character for which the retries are made.
Default is $$gbq_retry_options.
Use Legacy SQL for SQL Override
Indicates that the SQL Override query is specified in legacy SQL.
Use the following format to specify a legacy SQL query for the SQL Override Query property:
SELECT <Col1, Col2, Col3> FROM [projectID:datasetID.tableName]
Clear this option to define a standard SQL override query.
Use the following format to specify a standard SQL query for the SQL Override Query property:
SELECT * FROM `projectID.datasetID.tableName`
