PowerExchange for Microsoft Azure Blob Storage User Guide

10.4.0

Configuring Lzo Compression Format

To write files in the lzo compression format on the Spark and Databricks engines, you must copy the .jar files for lzo compression to the machine on which the Data Integration Service runs.

For the Spark engine, perform the following steps to copy the .jar files from the distribution directory to the Data Integration Service:

Copy the lzo.jar file from the cluster to the following directory on the machine on which the Data Integration Service runs:

<Informatica installation directory>/<distribution>/infaLib

Copy the lzo native binaries from the cluster to the following directory on the machine on which the Data Integration Service runs:

<Informatica installation directory>/<distribution>/lib/native

In the Administrator Console, navigate to the Data Integration Service.

The Data Integration Service page appears.

Click the Processes tab.

The Processes page appears.

Click the pencil icon to edit the environment variables in the Environment Variables section.

The Edit Environment Variables dialog box appears.

Click New to add a new environment variable.

The New Environment Variables dialog box appears.

Enter LD_LIBRARY_PATH in the Name field.

Enter the following path in the Value field:

<infahome>/services/shared/bin:/<infahome>/services/shared/Hadoop/<distributionType>/lib/native

Restart the Data Integration Service for changes to take effect.
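The file-copy steps and the path check for the Spark engine can be sketched in Python. This is only an illustration under stated assumptions, not an Informatica utility: `staged_dir` is a hypothetical local folder that already holds `lzo.jar` and a `native/` subfolder fetched from the cluster, and the function names are invented for this example. The target directories follow the paths given in the steps above.

```python
import shutil
from pathlib import Path


def stage_lzo_files(staged_dir: Path, infa_home: Path, distribution: str) -> list[Path]:
    """Copy lzo.jar and the lzo native binaries into the documented directories.

    staged_dir is a hypothetical folder holding lzo.jar and native/<binaries>
    that were already fetched from the cluster.
    """
    jar_target = infa_home / distribution / "infaLib"
    native_target = infa_home / distribution / "lib" / "native"
    # Create the targets only so the sketch can run on a scratch directory;
    # on a real installation these directories are expected to exist already.
    jar_target.mkdir(parents=True, exist_ok=True)
    native_target.mkdir(parents=True, exist_ok=True)

    copied = [Path(shutil.copy2(staged_dir / "lzo.jar", jar_target))]
    for binary in sorted((staged_dir / "native").iterdir()):
        if binary.is_file():
            copied.append(Path(shutil.copy2(binary, native_target)))
    return copied


def missing_library_dirs(value: str) -> list[str]:
    """Return the entries of a colon-separated path value that are not directories."""
    return [entry for entry in value.split(":") if entry and not Path(entry).is_dir()]
```

Running `missing_library_dirs` on the string you plan to enter in the Value field, before you restart the Data Integration Service, is one way to catch a mistyped directory early. An empty list means every entry resolves to an existing directory on that machine.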

For the Databricks engine, perform the following steps to copy the .jar files from the distribution directory to the Data Integration Service:

Copy the lzo.jar file from the cluster to the following directory on the machine on which the Data Integration Service runs:

<Informatica installation directory>/services/shared/hadoop/Databricks_<version>/runtimeLib

In the Spark Config section of your Databricks cluster configuration, add the Lzo compression codecs. The following snippet shows a sample configuration:

spark.hadoop.io.compression.codecs "org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec,org.apache.hadoop.io.compress.BZip2Codec"

Restart the Data Integration Service for changes to take effect.

For more information, see the Microsoft Azure Databricks documentation.
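Because the codec list is one long comma-separated value, a small check can confirm that both Lzo codec classes from the sample configuration above are present before you save the cluster settings. The following Python sketch is an assumption-laden illustration: the helper names are hypothetical, and it only parses the text of a single Spark Config line without contacting Databricks.

```python
CODECS_KEY = "spark.hadoop.io.compression.codecs"

# The two Lzo codec classes that appear in the sample configuration.
REQUIRED_LZO_CODECS = {
    "com.hadoop.compression.lzo.LzoCodec",
    "com.hadoop.compression.lzo.LzopCodec",
}


def parse_codecs(spark_conf_line: str) -> list[str]:
    """Split one 'key value' Spark Config line into its codec class names."""
    key, _, value = spark_conf_line.strip().partition(" ")
    if key != CODECS_KEY:
        raise ValueError(f"expected a line starting with {CODECS_KEY}")
    # Tolerate optional surrounding double quotes around the value.
    cleaned = value.strip().strip('"')
    return [codec.strip() for codec in cleaned.split(",") if codec.strip()]


def missing_lzo_codecs(spark_conf_line: str) -> set[str]:
    """Return the Lzo codec classes that the line does not list."""
    return REQUIRED_LZO_CODECS - set(parse_codecs(spark_conf_line))
```

An empty set from `missing_lzo_codecs` means the line lists both Lzo codecs. It does not prove that the codec classes are actually available on the cluster.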
