Hive Connector

Accessing multiple storage systems

Create a Hive connection to read data from or write data to Hive.
Use the DFS URI property in the connection parameters to connect to various storage systems. The following list describes the supported storage systems and the DFS URI format for each storage system:
Storage: HDFS
DFS URI format: hdfs://<namenode>:<port>
where:
  • <namenode> is the host name or IP address of the NameNode.
  • <port> is the port on which the NameNode listens for remote procedure calls (RPC).
Use hdfs://<nameservice> in case of NameNode high availability.

Storage: WASB in HDInsight
DFS URI format: wasb://<container_name>@<account_name>.blob.core.windows.net/<path>
where:
  • <container_name> identifies a specific Azure Storage Blob container. <container_name> is optional.
  • <account_name> identifies the Azure Storage Blob object.
Not applicable for a Hive connection in mappings that run on the advanced cluster.

Storage: Amazon S3
DFS URI format: s3a://home

Storage: Azure Data Lake Gen2 in HDInsight
DFS URI format: abfss://<container name>@<storage name>.dfs.core.windows.net
where:
  • <container name> identifies a specific Azure Data Lake Gen2 container.
  • <storage name> identifies the Azure Data Lake Gen2 storage account name.
Applicable to a Hive connection used in mappings configured in advanced mode.
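
As a rough illustration, the Python sketch below shows what concrete DFS URI values matching the formats above might look like. The host, container, account, and path names are placeholders chosen for this example, not values from a real environment.

    # Illustrative only: placeholder DFS URI values matching the formats above.
    # Host names, containers, accounts, and paths are made-up examples.
    from urllib.parse import urlparse

    EXAMPLE_DFS_URIS = {
        "HDFS": "hdfs://namenode.example.com:8020",
        "HDFS (NameNode high availability)": "hdfs://mycluster",
        "WASB in HDInsight": "wasb://mycontainer@myaccount.blob.core.windows.net/data",
        "Amazon S3": "s3a://home",
        "Azure Data Lake Gen2 in HDInsight": "abfss://mycontainer@mystorage.dfs.core.windows.net",
    }

    for storage, uri in EXAMPLE_DFS_URIS.items():
        # The URI scheme (hdfs, wasb, s3a, abfss) identifies the target storage system.
        print(f"{storage}: scheme={urlparse(uri).scheme}, uri={uri}")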
