PowerExchange for Google BigQuery User Guide

Partitioning

When you read data from or write data to Google BigQuery, you can configure partitioning to optimize mapping performance at run time. You can configure partitioning for Google BigQuery mappings that you run on the native or Spark engine. The partition type controls how the Data Integration Service distributes data among partitions at partition points. You can configure a partition key for a Google BigQuery data object that uses a simple or hybrid connection mode.
You can define the partition type as key range partitioning. To configure key range partitioning, open the Google BigQuery data object read operation and select the Key Range partition type option on the Run-time tab.
When you configure key range partitioning, the Data Integration Service distributes rows of data based on a port or set of ports that you define as the partition key. You can define a range of values for each port. The Data Integration Service uses the key and ranges to send rows to the appropriate partition.
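The routing described above can be sketched in a few lines of Python. This is only an illustration of the general key range technique, not the Data Integration Service implementation. The function name `assign_partition`, the `order_id` port, the sample ranges, and the convention that a range includes its start value and excludes its end value are all assumptions made for the example.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical range for one partition key port: (start, end).
# Assumption: start is inclusive and end is exclusive.
Range = Tuple[float, float]


def assign_partition(key_value: float, ranges: Sequence[Range]) -> Optional[int]:
    """Return the index of the first range that contains key_value, or None."""
    for index, (start, end) in enumerate(ranges):
        if start <= key_value < end:
            return index
    return None


# Three partitions keyed on a hypothetical integer port named order_id.
ranges = [(0, 1000), (1000, 2000), (2000, 3000)]
rows = [{"order_id": 15}, {"order_id": 1000}, {"order_id": 2999}]

# Send each row to the partition whose range contains its key value.
partitions = {i: [] for i in range(len(ranges))}
for row in rows:
    target = assign_partition(row["order_id"], ranges)
    if target is not None:
        partitions[target].append(row)
```

In this sketch a row whose key falls outside every range is not assigned to any partition. How the Data Integration Service treats such rows is not covered here.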
When you use a simple connection mode in a Google BigQuery connection, you can configure a partition key for fields of the following data types:
  • Integer
  • Float
You can preview data when you configure key range partitioning for a Google BigQuery data object that uses a simple, complex, or hybrid connection mode.
When you use a hybrid connection mode in a Google BigQuery connection, you can configure a partition key for fields of the following data types:
  • Integer
  • Float
  • Numeric
  • Timestamp
You cannot configure a partition key for Record data type columns and repeated columns.
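The data type rules above can be summarized as a small lookup, shown here as a Python sketch. It is a reading aid rather than product code. The names `PARTITION_KEY_TYPES` and `can_be_partition_key` are hypothetical, and treating any connection mode not listed above (such as complex) as having no supported partition key types is an assumption based on this topic alone.

```python
from typing import Dict, FrozenSet

# Data types that this topic lists as valid partition keys, per connection mode.
PARTITION_KEY_TYPES: Dict[str, FrozenSet[str]] = {
    "simple": frozenset({"integer", "float"}),
    "hybrid": frozenset({"integer", "float", "numeric", "timestamp"}),
}


def can_be_partition_key(connection_mode: str, data_type: str,
                         repeated: bool = False) -> bool:
    """Check a column against the partition key rules listed in this topic."""
    # Record columns and repeated columns are never valid partition keys.
    if repeated or data_type.lower() == "record":
        return False
    # Assumption: modes not listed in this topic allow no partition key types.
    allowed = PARTITION_KEY_TYPES.get(connection_mode.lower(), frozenset())
    return data_type.lower() in allowed
```

For example, `can_be_partition_key("hybrid", "Timestamp")` evaluates to `True`, while the same Timestamp column in simple connection mode evaluates to `False`.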
You can configure dynamic partitioning for a Google BigQuery data object write operation. To configure dynamic partitioning, open the Google BigQuery data object write operation and select the Dynamic partition type option on the Run-time tab.
When you configure dynamic partitioning, the Data Integration Service determines the number of partitions that it must create at run time. It scales the number of partitions based on factors such as the maximum parallelism value defined for the Data Integration Service and the mapping, and the number of CPUs available on the nodes where the mappings run.
When you configure the advanced properties in the data object write operation, you cannot use the Write Disposition > Write truncate option with multiple partitions. To use Write truncate, set the Maximum Parallelism property for the mapping to 1.
When you configure key range partitioning, you cannot use the create target option.
