PowerExchange for Google BigQuery User Guide

10.5.7
Partitioning

When you read data from or write data to Google BigQuery, you can configure partitioning to optimize mapping performance at run time. You can configure partitioning for Google BigQuery mappings that you run in the native environment or on the Spark engine. The partition type controls how the Data Integration Service distributes data among partitions at partition points. You can configure a partition key for a Google BigQuery data object that uses a simple or hybrid connection mode.

You can define the partition type as key range partitioning. To configure key range partitioning, open the Google BigQuery data object read operation, and select the Key Range partition type option on the Run-time tab.

When you configure key range partitioning, the Data Integration Service distributes rows of data based on a port or set of ports that you define as the partition key. You can define a range of values for each port. The Data Integration Service uses the key and ranges to send rows to the appropriate partition.
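The routing idea behind key range partitioning can be sketched in a few lines of Python. This is an illustration only, not the Data Integration Service implementation: the function names, the single numeric key, and the choice of half-open ranges (start inclusive, end exclusive) are all assumptions made for the example.

```python
from bisect import bisect_right
from typing import Any, Dict, List, Optional, Sequence, Tuple

Row = Dict[str, Any]


def find_partition(value: float, boundaries: Sequence[float]) -> Optional[int]:
    """Return the index of the range that contains value, or None.

    Assumption: boundaries is sorted, and partition i covers the
    half-open range [boundaries[i], boundaries[i + 1]).
    """
    if len(boundaries) < 2 or value < boundaries[0] or value >= boundaries[-1]:
        return None
    return bisect_right(boundaries, value) - 1


def distribute(
    rows: Sequence[Row], key: str, boundaries: Sequence[float]
) -> Tuple[List[List[Row]], List[Row]]:
    """Group rows into partitions by the value of the partition key port."""
    partitions: List[List[Row]] = [[] for _ in range(max(len(boundaries) - 1, 0))]
    unmatched: List[Row] = []  # rows whose key falls outside every range
    for row in rows:
        index = find_partition(row[key], boundaries)
        if index is None:
            unmatched.append(row)
        else:
            partitions[index].append(row)
    return partitions, unmatched


if __name__ == "__main__":
    sample = [{"id": 5}, {"id": 150}, {"id": 299}, {"id": 300}]
    parts, skipped = distribute(sample, "id", [0, 100, 200, 300])
    print(parts)
    print(skipped)
```

With the boundaries `[0, 100, 200, 300]`, the sketch defines three ranges. How the actual service treats a row whose key falls outside every range is not described here, so the example simply collects such rows separately.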

When you use a simple connection mode in a Google BigQuery connection, you can configure a partition key for fields of the following data types:

- Integer
- Float

You can preview data when you configure key range partitioning for a Google BigQuery data object that uses a simple, complex, or hybrid connection mode.

When you use a hybrid connection mode in a Google BigQuery connection, you can configure a partition key for fields of the following data types:

- Integer
- Float
- Numeric
- Timestamp

You cannot configure a partition key for Record data type columns and repeated columns.
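The data type rules above can be summarized as a small lookup. The following Python sketch is only a way to check a planned partition key against those rules before you configure it. The names `SUPPORTED_KEY_TYPES` and `can_be_partition_key` are hypothetical, and treating the complex connection mode as unsupported is an assumption based on this topic listing only the simple and hybrid modes.

```python
# Partition key data types per connection mode, as listed in this topic.
SUPPORTED_KEY_TYPES = {
    "simple": {"Integer", "Float"},
    "hybrid": {"Integer", "Float", "Numeric", "Timestamp"},
}


def can_be_partition_key(mode: str, data_type: str, repeated: bool = False) -> bool:
    """Return True if a column may serve as a partition key.

    Record columns and repeated columns are never allowed. Modes that are
    not listed (for example "complex") are treated as unsupported, which
    is an assumption of this sketch.
    """
    if repeated or data_type == "Record":
        return False
    return data_type in SUPPORTED_KEY_TYPES.get(mode, set())


if __name__ == "__main__":
    print(can_be_partition_key("simple", "Integer"))
    print(can_be_partition_key("simple", "Timestamp"))
    print(can_be_partition_key("hybrid", "Timestamp"))
    print(can_be_partition_key("hybrid", "Integer", repeated=True))
```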

You can configure dynamic partitioning for a Google BigQuery data object write operation. To configure dynamic partitioning, open the Google BigQuery data object write operation, and select the Dynamic partition type option on the Run-time tab.

When you configure dynamic partitioning, the Data Integration Service determines the number of partitions that it must create at run time. It scales the number of partitions based on factors such as the maximum parallelism value defined for the Data Integration Service and the mapping, and the number of CPUs available on the nodes where the mappings run.
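As a rough mental model of how these factors might bound the partition count, the Python sketch below takes the smallest of the three values. The exact calculation that the Data Integration Service performs is not given in this topic, so the formula, the function name `estimate_partition_count`, and its parameters are assumptions for illustration only.

```python
def estimate_partition_count(
    service_max_parallelism: int,
    mapping_max_parallelism: int,
    available_cpus: int,
) -> int:
    """Estimate an upper bound on partitions created at run time.

    Assumption: the count cannot exceed the maximum parallelism of the
    service, the maximum parallelism of the mapping, or the number of
    CPUs on the node. The real service may weigh other factors.
    """
    limits = (service_max_parallelism, mapping_max_parallelism, available_cpus)
    if min(limits) < 1:
        raise ValueError("all limits must be at least 1")
    return min(limits)


if __name__ == "__main__":
    # A mapping capped at 4 on a service that allows 8, with 16 CPUs.
    print(estimate_partition_count(8, 4, 16))
    # Maximum parallelism of 1 for the mapping forces a single partition.
    print(estimate_partition_count(8, 1, 16))
```

Under this model, setting the maximum parallelism for the mapping to 1 always yields a single partition, regardless of the other two values.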

When you configure the advanced properties in the data object write operation, you cannot use the Write truncate option of the Write Disposition property with multiple partitions. In that case, set the Maximum Parallelism property for the mapping to 1.

When you configure key range partitioning, you cannot use the create target option.
