Data Engineering Administrator Guide

Azure Cloud Provisioning Configuration Properties

The properties in the Azure cloud provisioning configuration enable the Data Integration Service to contact and create resources on the Azure cloud platform.

Authentication Details

The following table describes the authentication properties to configure:

| Property | Description |
|---|---|
| Name | Name of the cloud provisioning configuration. |
| ID | ID of the cloud provisioning configuration. Default: Same as the cloud provisioning configuration name. |
| Description | Optional. Description of the cloud provisioning configuration. |
| Subscription ID | ID of the Azure account to use in the cluster creation process. |
| Tenant ID | A GUID string associated with the Azure Active Directory. |
| Client ID | A GUID string that is the same as the Application ID associated with the Service Principal. The Service Principal must be assigned to a role that has permission to create resources in the subscription that you identified in the Subscription ID property. |
| Client Secret | An octet string that provides a key associated with the client ID. |
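As an illustration of the constraints in the table above, the following minimal Python sketch checks a set of authentication properties before you enter them. It is not part of the Informatica product. The dictionary keys reuse the property names from the table, and the rule that every property except ID and Description is required is an assumption made for this example.

```python
import uuid

# Assumption: every property except ID and Description must be supplied.
REQUIRED = ("Name", "Subscription ID", "Tenant ID", "Client ID", "Client Secret")
# The table describes these two properties as GUID strings.
GUID_PROPERTIES = ("Tenant ID", "Client ID")


def is_guid(value):
    """Return True if value is a GUID in canonical 8-4-4-4-12 form."""
    try:
        return str(uuid.UUID(value)) == value.lower()
    except (ValueError, AttributeError, TypeError):
        return False


def check_auth_properties(props):
    """Return a list of problems found in the authentication properties."""
    problems = []
    for key in REQUIRED:
        if not props.get(key):
            problems.append(f"missing value: {key}")
    for key in GUID_PROPERTIES:
        value = props.get(key)
        if value and not is_guid(value):
            problems.append(f"not a GUID: {key}")
    return problems


def with_default_id(props):
    """Return a copy in which ID falls back to the configuration name."""
    resolved = dict(props)
    if not resolved.get("ID"):
        resolved["ID"] = resolved.get("Name")
    return resolved
```

An empty list from `check_auth_properties` means that the sketch found no problems. It does not confirm that Azure accepts the credentials.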

Storage Account Details

Choose to configure access to one of the following storage types:
  • Azure Data Lake Storage (ADLS)
  • Azure General Storage, also known as blob storage

The following table describes the information you need to configure Azure Data Lake Storage (ADLS) with the HDInsight cluster:

| Property | Description |
|---|---|
| Azure Data Lake Store Name | Name of the ADLS storage to access. The ADLS storage and the cluster to create must reside in the same region. |
| Data Lake Service Principal Client ID | A credential that enables programmatic access to ADLS storage. Enables the Informatica domain to communicate with ADLS and run commands and mappings on the HDInsight cluster. The service principal is an Azure user that meets the following requirements: permissions to access required directories in ADLS storage, certificate-based authentication for ADLS storage, and key-based authentication for ADLS storage. |
| Data Lake Service Principal Certificate Contents | The Base64-encoded text of the public certificate used with the service principal. Leave this property blank when you create the cloud provisioning configuration. After you save the cloud provisioning configuration, log in to the VM where the Informatica domain is installed and run `infacmd ccps updateADLSCertificate` to populate this property. |
| Data Lake Service Principal Certificate Password | Private key for the service principal. This private key must be associated with the service principal certificate. |
| Data Lake Service Principal Client Secret | An octet string that provides a key associated with the service principal. |
| Data Lake Service Principal OAUTH Token Endpoint | Endpoint for OAuth token-based authentication. |
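The value of the Data Lake Service Principal OAUTH Token Endpoint property is usually derived from the tenant ID. The following Python sketch builds such a URL. It assumes the widely used Azure Active Directory v1 pattern `https://login.microsoftonline.com/<tenant ID>/oauth2/token`, which this guide does not state, and the function name is hypothetical. Confirm the actual endpoint for your application registration in the Azure web console.

```python
import uuid

# Assumption: Azure Active Directory v1 token endpoint pattern.
TOKEN_ENDPOINT_TEMPLATE = "https://login.microsoftonline.com/{tenant_id}/oauth2/token"


def build_token_endpoint(tenant_id):
    """Build an OAuth token endpoint URL from a tenant ID GUID."""
    try:
        # uuid.UUID raises ValueError for text that is not a GUID.
        canonical = str(uuid.UUID(tenant_id))
    except (ValueError, AttributeError, TypeError):
        raise ValueError(f"tenant ID is not a GUID: {tenant_id!r}")
    return TOKEN_ENDPOINT_TEMPLATE.format(tenant_id=canonical)
```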
The following table describes the information you need to configure Azure General Storage, also known as blob storage, with the HDInsight cluster:

| Property | Description |
|---|---|
| Azure Storage Account Name | Name of the storage account to access. Get the value from the Storage Accounts node in the Azure web console. The storage and the cluster to create must reside in the same region. |
| Azure Storage Account Key | A key to authenticate access to the storage account. To get the value from the Azure web console, select the storage account, then Access Keys. The console displays the account keys. |

Cluster Deployment Details

The following table describes the cluster deployment properties that you configure:

| Property | Description |
|---|---|
| Resource Group | Resource group in which to create the cluster. A resource group is a logical set of Azure resources. |
| Virtual Network Resource Group | Optional. Resource group to which the virtual network belongs. If you do not specify a resource group, the Data Integration Service assumes that the virtual network is a member of the same resource group as the cluster. |
| Virtual Network | Name of the virtual network or vnet where you want to create the cluster. Specify a vnet that resides in the resource group that you specified in the Virtual Network Resource Group property. The vnet must be in the same region as the cluster. |
| Subnet Name | Subnet in which to create the cluster. The subnet must be a part of the vnet that you specified in the Virtual Network property. Each vnet can have one or more subnets. The Azure administrator can choose an existing subnet or create one for the cluster. |
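The fallback described for the Virtual Network Resource Group property can be pictured with a short Python sketch. The class and field names are hypothetical and only mirror the property names in the table. The sketch illustrates the documented rule, not how the Data Integration Service implements it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClusterDeployment:
    """Hypothetical holder for the cluster deployment properties."""
    resource_group: str
    virtual_network: str
    subnet_name: str
    virtual_network_resource_group: Optional[str] = None

    def effective_vnet_resource_group(self) -> str:
        # When no vnet resource group is given, the vnet is assumed to
        # belong to the same resource group as the cluster.
        return self.virtual_network_resource_group or self.resource_group
```

For example, a deployment that sets only `resource_group="rg-cluster"` resolves the vnet resource group to `rg-cluster`, while one that also sets `virtual_network_resource_group="rg-network"` resolves it to `rg-network`.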

External Hive Metastore Details

You can specify the properties to enable the cluster to connect to a Hive metastore database that is external to the cluster.
If you do not specify an existing external database in this dialog box, the cluster creates its own database on the cluster. This database is terminated when the cluster is terminated.
You can use an external relational database like MySQL or Amazon RDS as the Hive metastore database. The external database must be on the same cloud platform as the cluster to create.
The following table describes the Hive metastore database properties that you configure:

| Property | Description |
|---|---|
| Database Name | Name of the Hive metastore database. |
| Database Server Name | Server on which the database resides. The database server name on the Azure web console commonly includes the suffix database.windows.net. For example: server123xyz.database.windows.net. You can specify the database server name without the suffix and Informatica will automatically append the suffix. For example, you can specify server123xyz. |
| Database User Name | User name of the account for the domain to use to access the database. |
| Database Password | Password for the user account. |
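The suffix handling described for the Database Server Name property amounts to the following Python sketch. The function name is hypothetical, and the case-insensitive comparison and whitespace trimming are assumptions. The sketch shows the documented behavior and is not Informatica code.

```python
# Suffix that the Azure web console commonly shows on database server names.
SUFFIX = ".database.windows.net"


def normalize_server_name(name):
    """Return the server name with the suffix appended when it is absent."""
    name = name.strip()
    if name.lower().endswith(SUFFIX):
        return name
    return name + SUFFIX
```

With this sketch, both `server123xyz` and `server123xyz.database.windows.net` resolve to `server123xyz.database.windows.net`.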
