Table of Contents

  1. Preface
  2. Introduction to PowerExchange for Microsoft Azure Blob Storage
  3. PowerExchange for Microsoft Azure Blob Storage Configuration
  4. Microsoft Azure Blob Storage Connections
  5. Microsoft Azure Blob Storage Data Objects
  6. Microsoft Azure Blob Storage Mappings
  7. Data Type Reference

PowerExchange for Microsoft Azure Blob Storage User Guide

Prerequisites

Before you use PowerExchange for Microsoft Azure Blob Storage, perform the following tasks:
  • Verify that the Hadoop Distribution Directory property in the developerCore.ini file is set based on the Hadoop distribution that you use.
  • To run a mapping, you must configure the INFA_PARSER_HOME environment variable for the Data Integration Service in Informatica Administrator. Set the value of the environment variable to the absolute path of the Hadoop distribution directory on the machine that runs the Data Integration Service. For a sample value, see the example that follows this list.
When you import a data object in a mapping, do not use the MapR distribution.
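For example, on a hypothetical Linux machine that runs the Data Integration Service, the value of the environment variable might look like the following, where the path is a placeholder for the actual Hadoop distribution directory and <distribution> is the directory name of the Hadoop distribution that you use:
INFA_PARSER_HOME=/opt/informatica/hadoop/<distribution>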

Configure Databricks Connection Advanced Properties

Verify that a Databricks connection is created in the domain. If you want to read NULL values from or write NULL values to an Azure source, configure the following advanced properties in the Databricks connection:
  • infaspark.flatfile.reader.nullValue=True
  • infaspark.flatfile.writer.nullValue=True

Configure Azure Blob Storage Access in Azure Databricks Cluster

Verify that a cluster configuration is created in the domain. Set your Azure Blob Storage account name and account key under Spark Config in your Databricks cluster configuration to access Azure Blob Storage. Add "spark.hadoop" as a prefix to the Hadoop configuration key, as shown in the following text:
spark.hadoop.fs.azure.account.key.<your-storage-account-name>.blob.core.windows.net <your-storage-account-access-key>
If you use multiple Azure Blob Storage accounts, you must configure the account name and account key for each account.
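For example, a cluster that accesses two hypothetical storage accounts named salesaccount and auditaccount would contain the following entries in Spark Config, where each access key value is a placeholder:
spark.hadoop.fs.azure.account.key.salesaccount.blob.core.windows.net <salesaccount-access-key>
spark.hadoop.fs.azure.account.key.auditaccount.blob.core.windows.net <auditaccount-access-key>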

Configure Azure Blob Storage SAS Access in Azure Databricks Cluster

Verify that a cluster configuration is created in the domain. Set your Azure Blob Storage account name and SAS token under Spark Config in your Databricks cluster configuration to access Azure Blob Storage. Add "spark.hadoop" as a prefix to the Hadoop configuration key, as shown in the following text:
spark.hadoop.fs.azure.sas.<container-name>.<storage-account-name>.blob.core.windows.net <sas-token-for-your-blob-account>
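The Spark Config entry in the cluster configuration is the approach that this guide describes. As an illustration only, the following minimal PySpark sketch shows the same configuration key set at the session level from a Databricks notebook, followed by a test read. The container name, storage account name, SAS token, and file path are hypothetical placeholders, and spark is the SparkSession that Databricks provides in a notebook.

# Session-level equivalent of the cluster-level Spark Config entry shown above.
# All names and the token below are hypothetical placeholders.
spark.conf.set(
    "fs.azure.sas.mycontainer.mystorageaccount.blob.core.windows.net",
    "<sas-token-for-your-blob-account>")

# Read a sample file from the container to verify that the SAS token grants access.
df = spark.read.csv(
    "wasbs://mycontainer@mystorageaccount.blob.core.windows.net/sample/data.csv",
    header=True)
df.show()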

Configure access to secure transfer-enabled storage accounts

Verify that the Secure transfer required option in the Configuration tab in your Azure Blob Storage account is enabled. In addition, set the following custom property for the Data Integration Service:
SecureTransferRequired=True
After you configure the custom property, restart the Data Integration Service.
