Google Cloud Storage V2 Connector

Data compression in Google Cloud Storage V2 sources and targets

You can decompress the data when you read data from a Google Cloud Storage V2 source and compress the data when you write data to a Google Cloud Storage V2 target.
Configure the compression format in the Compression Format option under the advanced source and target properties.
The following table lists the supported compression formats in the source for different file formats:
Compression format   Avro File   Flat File   JSON File   Parquet File
Gzip                 No          Yes         No          Yes
None                 Yes         Yes         Yes         Yes
To use the Deflate or Snappy compression format for the Avro and Parquet file formats, select None as the compression format.
The following table lists the supported compression formats in the target for different file formats:
Compression format   Avro File   Flat File   JSON File   Parquet File
Deflate              Yes         No          No          No
Gzip                 No          Yes         No          Yes
None                 Yes         Yes         Yes         Yes
Snappy               Yes         No          No          Yes
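The two support tables above can be encoded as a lookup for a pre-flight check on a mapping configuration. The following is a minimal sketch in Python and is not part of the connector; the names SOURCE_SUPPORT, TARGET_SUPPORT, and is_supported are hypothetical.

```python
# Supported file formats per compression format, transcribed from the tables above.
# All names here are illustrative and are not part of the connector.
SOURCE_SUPPORT = {
    "Gzip": {"Avro": False, "Flat": True, "JSON": False, "Parquet": True},
    "None": {"Avro": True, "Flat": True, "JSON": True, "Parquet": True},
}

TARGET_SUPPORT = {
    "Deflate": {"Avro": True, "Flat": False, "JSON": False, "Parquet": False},
    "Gzip": {"Avro": False, "Flat": True, "JSON": False, "Parquet": True},
    "None": {"Avro": True, "Flat": True, "JSON": True, "Parquet": True},
    "Snappy": {"Avro": True, "Flat": False, "JSON": False, "Parquet": True},
}


def is_supported(role: str, compression: str, file_format: str) -> bool:
    """Return True if the tables list the combination as supported.

    role is "source" or "target". Combinations that do not appear in the
    tables are treated as unsupported.
    """
    if role not in ("source", "target"):
        raise ValueError(f"role must be 'source' or 'target', got {role!r}")
    table = SOURCE_SUPPORT if role == "source" else TARGET_SUPPORT
    return table.get(compression, {}).get(file_format, False)
```

For example, is_supported("target", "Snappy", "Parquet") returns True, while is_supported("source", "Gzip", "Avro") returns False.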
To read a compressed file from Google Cloud Storage V2, the file name must have a specific extension. If the extension of the compressed file is not valid, the Secure Agent does not process the file. The following table describes the extension that is appended based on the compression format that you use:
Compression format   File Name Extension
Deflate              .deflate
Gzip                 .GZ
Snappy               .snappy
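The extension rule can be checked before a mapping runs. The following Python sketch is an illustration only: EXPECTED_EXTENSION and has_valid_extension are hypothetical names, and comparing extensions case-insensitively is an assumption, because the table lists the Gzip extension in upper case without stating whether case matters.

```python
# Extensions transcribed from the table above.
# Assumption for this sketch: extensions are compared case-insensitively.
EXPECTED_EXTENSION = {
    "Deflate": ".deflate",
    "Gzip": ".GZ",
    "Snappy": ".snappy",
}


def has_valid_extension(file_name: str, compression: str) -> bool:
    """Check that file_name ends with the extension listed for compression."""
    if compression == "None":
        # No extension requirement is listed for uncompressed files.
        return True
    if compression not in EXPECTED_EXTENSION:
        raise ValueError(f"unknown compression format: {compression!r}")
    expected = EXPECTED_EXTENSION[compression].lower()
    return file_name.lower().endswith(expected)
```

The file names used to call the function, such as has_valid_extension("orders.csv.GZ", "Gzip"), are placeholders.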
Use the following guidelines when you configure data compression:
  • Data compression is supported at the file level. You cannot use data compression for a directory.
  • When the Is Directory property is selected at the source, the files within the directory are read sequentially.
  • When you download a compressed Gzip file from the Google Cloud Platform console, the uncompressed file is downloaded by default. To download the compressed file, you need to remove the content encoding metadata of the object manually. Select Edit object metadata of the object and remove Gzip from the Content-Encoding field.
  • When you configure the Gzip compression format in the target and the mapping fails with a Java heap space error, update the staging optimization memory in the JVMOptions property to -Xmx2048m and -Xms512m. Google Cloud Storage requires a buffer size of 15 MB to upload the compressed files.
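The manual metadata edit described in the download guideline can also be scripted. The sketch below assumes the google-cloud-storage Python client library, which this document does not mention. The helper clear_content_encoding is hypothetical, and the demo uses a stand-in object so that it runs without network access or credentials.

```python
from types import SimpleNamespace


def clear_content_encoding(blob) -> None:
    """Remove the Content-Encoding metadata from a blob-like object.

    blob is expected to behave like google.cloud.storage.Blob: it exposes a
    writable content_encoding attribute and a patch() method that sends the
    changed metadata to Google Cloud Storage.
    """
    if blob.content_encoding:
        blob.content_encoding = None
        blob.patch()


# Stand-in object for a local dry run; it only records that patch() was called.
calls = []
fake_blob = SimpleNamespace(
    content_encoding="gzip",
    patch=lambda: calls.append("patch"),
)
clear_content_encoding(fake_blob)
print(fake_blob.content_encoding, calls)

# With the real client (an assumption; bucket and object names are placeholders):
#   from google.cloud import storage
#   blob = storage.Client().bucket("my-bucket").get_blob("data/orders.csv.GZ")
#   clear_content_encoding(blob)
```

The helper skips the patch() call when no content encoding is set, so running it repeatedly against the same object does not send redundant metadata updates.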
