Google BigQuery Connectors

Troubleshooting a mapping in advanced mode

Mapping configured to write Date and Int96 data types for Parquet file fails
A mapping configured to read from a Google BigQuery source and write to a Parquet file in a Google Cloud Storage target fails in the following cases:
  • Data is of the Date data type and the date is earlier than 1582-10-15.
  • Data is of the Int96 data type and the timestamp is earlier than 1900-01-01T00:00:00Z.
To resolve this issue, specify the following Spark session properties in the mapping task or in the custom properties file for the Secure Agent:
  • spark.sql.parquet.int96RebaseModeInWrite=LEGACY
  • spark.sql.parquet.datetimeRebaseModeInWrite=LEGACY
  • spark.sql.parquet.int96RebaseModeInRead=LEGACY
  • spark.sql.parquet.datetimeRebaseModeInRead=LEGACY
  • spark.sql.avro.datetimeRebaseModeInWrite=LEGACY
  • spark.sql.avro.datetimeRebaseModeInRead=LEGACY
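To decide whether a data set is likely to need these properties, it can help to check the values against the two thresholds quoted above. The following is a minimal Python sketch of such a check, not part of the connector. The helper name needs_legacy_rebase is hypothetical, and treating timestamps without a time zone as UTC is an assumption of the sketch.

```python
from datetime import date, datetime, timezone

# Thresholds quoted in the failure cases above.
DATE_CUTOFF = date(1582, 10, 15)
INT96_CUTOFF = datetime(1900, 1, 1, tzinfo=timezone.utc)


def needs_legacy_rebase(value):
    """Return True if the value is earlier than the threshold for its type.

    Hypothetical helper for illustration only.
    """
    # datetime is a subclass of date, so it must be checked first.
    if isinstance(value, datetime):
        if value.tzinfo is None:
            # Assumption: timestamps without a time zone are in UTC.
            value = value.replace(tzinfo=timezone.utc)
        return value < INT96_CUTOFF
    if isinstance(value, date):
        return value < DATE_CUTOFF
    # Other data types are not affected by the cases described above.
    return False


print(needs_legacy_rebase(date(1582, 10, 14)))      # True
print(needs_legacy_rebase(date(1582, 10, 15)))      # False
print(needs_legacy_rebase(datetime(1899, 12, 31)))  # True
```

If any value in the source returns True, the mapping falls into one of the failure cases and the LEGACY properties are worth setting.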
Time zone for the Date and Timestamp data type fields defaults to the Secure Agent host machine time zone.  
When you run a mapping in advanced mode to read from or write to fields of the Date and Timestamp data types, the time zone defaults to the Secure Agent host machine time zone.
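The effect of the host time zone can be illustrated outside the product. The following Python sketch renders one instant in UTC and in a hypothetical host time zone with a fixed UTC-08:00 offset. The instant and the offset are assumptions chosen for illustration and do not come from the connector.

```python
from datetime import datetime, timedelta, timezone

# One fixed instant, expressed unambiguously in UTC.
instant = datetime(2024, 1, 1, 2, 30, tzinfo=timezone.utc)

# Hypothetical Secure Agent host time zone: a fixed UTC-08:00 offset.
host_tz = timezone(timedelta(hours=-8))

# Wall-clock values as they would appear in each time zone.
as_host = instant.astimezone(host_tz).replace(tzinfo=None)
as_utc = instant.astimezone(timezone.utc).replace(tzinfo=None)

print(as_host)  # 2023-12-31 18:30:00
print(as_utc)   # 2024-01-01 02:30:00
```

The same instant falls on a different calendar date in each rendering, so the time zone in effect can change both Timestamp and Date values.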
To change the Date and Timestamp fields to the UTC time zone, use one of the following methods: set the Spark properties globally in the Secure Agent directory, which applies to all tasks in the organization that use this Secure Agent, or set the Spark session properties for a specific task in the task properties.
To set the properties globally, perform the following tasks:
  1. Add the following properties to the <Secure Agent installation directory>/apps/At_Scale_Server/41.0.2.1/spark/custom.properties file:
    • infacco.job.spark.driver.extraJavaOptions=-Duser.timezone=UTC
    • infacco.job.spark.executor.extraJavaOptions=-Duser.timezone=UTC
  2. Restart the Secure Agent.
To set the properties for a specific task, navigate to the Spark session properties in the task properties, and perform the following steps:
  • Select spark.driver.extraJavaOptions as the session property name and set the value to -Duser.timezone=UTC.
  • Select spark.executor.extraJavaOptions as the session property name and set the value to -Duser.timezone=UTC.
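The global and task-level settings above use the same value and differ only in the property name. As a rough Python sketch, the two forms can be generated from one definition. The function names are hypothetical, and the infacco.job. prefix relationship is inferred from the property names listed above rather than from a documented rule.

```python
# Value and session property names taken from the steps above.
JVM_TZ_OPTION = "-Duser.timezone=UTC"
SESSION_KEYS = (
    "spark.driver.extraJavaOptions",
    "spark.executor.extraJavaOptions",
)
# Assumption: the global names are the session names with this prefix,
# as the two lists above suggest.
GLOBAL_PREFIX = "infacco.job."


def session_properties():
    """Name/value pairs to enter as Spark session properties for one task."""
    return {key: JVM_TZ_OPTION for key in SESSION_KEYS}


def global_property_lines():
    """Lines to add to custom.properties for all tasks on the Secure Agent."""
    return [
        f"{GLOBAL_PREFIX}{key}={value}"
        for key, value in session_properties().items()
    ]


for line in global_property_lines():
    print(line)
```

Keeping both forms in one place makes it less likely that the driver and executor settings diverge. If they did, the two JVMs could interpret Date and Timestamp values in different time zones.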
