Table of Contents

  1. Preface
  2. Introduction to PowerExchange for Microsoft Azure Blob Storage
  3. PowerExchange for Microsoft Azure Blob Storage Configuration
  4. Microsoft Azure Blob Storage Connections
  5. Microsoft Azure Blob Storage Data Objects
  6. Microsoft Azure Blob Storage Mappings
  7. Data Type Reference

PowerExchange for Microsoft Azure Blob Storage User Guide

Configuring LZO Compression Format

To write files in the LZO compression format on the Spark and Databricks engines, you must copy the .jar files for LZO compression to the machine on which the Data Integration Service runs.
For the Spark engine, perform the following steps to copy the .jar files from the distribution directory to the Data Integration Service:
  1. Copy the lzo.jar file from the cluster to the following directory on the machine on which the Data Integration Service runs:
    <Informatica installation directory>/<distribution>/infaLib
  2. Copy the lzo native binaries from the cluster to the following directory on the machine on which the Data Integration Service runs:
    <Informatica installation directory>/<distribution>/lib/native
  3. In the Administrator Console, navigate to the Data Integration Service.
    The Data Integration Service page appears.
  4. Click the Processes tab.
    The Processes page appears.
  5. Click the pencil icon to edit the environment variables in the Environment Variables section.
    The Edit Environment Variables dialog box appears.
  6. Click New to add a new environment variable.
    The New Environment Variables dialog box appears.
  7. Enter the value of the Name field as LD_LIBRARY_PATH.
  8. Enter the following path in the Value field:
    <infahome>/services/shared/bin:<infahome>/services/shared/hadoop/<distributionType>/lib/native
  9. Restart the Data Integration Service for changes to take effect.
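The Spark-engine steps above can be sketched as a shell script. The installation directory, distribution name, and native library file names below are assumed example values, not values from this guide; substitute the paths for your own environment:

```shell
#!/bin/sh
# Sketch of the Spark-engine LZO setup above. INFA_HOME and DISTRIBUTION
# are assumed example values; replace them with your Informatica
# installation directory and Hadoop distribution name.
INFA_HOME="/opt/informatica"
DISTRIBUTION="CDP_7.1"

# Destination for lzo.jar (step 1) and the LZO native binaries (step 2).
JAR_DEST="$INFA_HOME/$DISTRIBUTION/infaLib"
NATIVE_DEST="$INFA_HOME/$DISTRIBUTION/lib/native"

# Value to enter for the LD_LIBRARY_PATH environment variable (steps 7-8).
LD_LIBRARY_PATH_VALUE="$INFA_HOME/services/shared/bin:$INFA_HOME/services/shared/hadoop/$DISTRIBUTION/lib/native"

echo "Copy lzo.jar to:         $JAR_DEST"
echo "Copy native binaries to: $NATIVE_DEST"
echo "LD_LIBRARY_PATH value:   $LD_LIBRARY_PATH_VALUE"

# Example copy commands, assuming the files were already fetched from the
# cluster into the current directory (uncomment to use):
# cp lzo.jar "$JAR_DEST/"
# cp libgplcompression.* "$NATIVE_DEST/"
```

After running the copy commands and setting the environment variable, restart the Data Integration Service as described in step 9.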
For the Databricks engine, perform the following steps to copy the .jar files from the distribution directory to the Data Integration Service:
  1. Copy the lzo.jar file from the cluster to the following directory on the machine on which the Data Integration Service runs:
    <Informatica installation directory>/services/shared/hadoop/Databricks_<version>/runtimeLib
  2. Configure Spark Config in your Databricks cluster configuration to use the LZO compression codec. The following snippet shows a sample configuration:
    spark.hadoop.io.compression.codecs "org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec,org.apache.hadoop.io.compress.BZip2Codec"
  3. Restart the Data Integration Service for changes to take effect.
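The Databricks-engine copy step can likewise be sketched as a short script. The installation directory and Databricks runtime version are assumed example values; substitute your own:

```shell
#!/bin/sh
# Sketch of the Databricks-engine copy step above. INFA_HOME and
# DATABRICKS_VERSION are assumed example values.
INFA_HOME="/opt/informatica"
DATABRICKS_VERSION="9.1"

# Destination for lzo.jar (step 1).
RUNTIME_LIB="$INFA_HOME/services/shared/hadoop/Databricks_$DATABRICKS_VERSION/runtimeLib"
echo "Copy lzo.jar to: $RUNTIME_LIB"

# Example copy command, assuming lzo.jar was already fetched from the
# cluster (uncomment to use):
# cp lzo.jar "$RUNTIME_LIB/"
```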
For more information, see the Microsoft Azure Databricks documentation.
