Workflow Basics Guide

PowerExchange for Hadoop Connections

Use a Hadoop HDFS application connection object for each Hadoop source or target that you want to access.
You connect to a Hadoop cluster through an HDFS host that runs the NameNode service for that cluster.
The following table describes the properties that you configure for a Hadoop HDFS application connection:
Property
Description
Name
The connection name used by the Workflow Manager. The connection name cannot contain spaces or special characters other than the underscore character.
User Name
The name of the user in the Hadoop group that is used to access the HDFS host.
Password
Password to access the HDFS host. Reserved for future use.
HDFS Connection URI
The URI to access HDFS. Use the value of the fs.default.name property as the NameNode URI. You can find the value of the fs.default.name property in the core-site.xml configuration set.
Syntax for Hadoop distributions:
hdfs://<namenode>:<port>
Where:
  • <namenode> is the host name or IP address of the NameNode.
  • <port> is the port on which the NameNode listens for remote procedure calls (RPC).
Syntax for the MapR distribution:
maprfs:///
Syntax for the HDInsight distribution:
  • adl://<nameservices>
  • wasb://<nameservices>
Hive Driver Name
The name of the Hive driver. By default, the driver name is org.apache.hive.jdbc.HiveDriver.
Hive URL
The URL to the Hive host.
For a MapR Ticket cluster, specify the URL in the following format:
jdbc:hive2://<hostname>:<portnumber>/default;auth=MAPRSASL
For a MapR Kerberos cluster, specify the URL in the following format:
jdbc:hive2://<hostname>:<portnumber>/default;principal=<spn>
Hive User Name
The Hive user name. Reserved for future use.
Hive Password
The password for the Hive user. Reserved for future use.
Hadoop Distribution
The name of the Hadoop distribution.
Default is cloudera_cdh.
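
The HDFS Connection URI property takes its value from the fs.default.name property in core-site.xml. The following Python sketch shows one way to look that value up. It is an illustration only and not part of the product. The function name, the sample XML, and the host and port are hypothetical. The fallback to fs.defaultFS, the property name used by newer Hadoop releases, is an assumption that the table does not state.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse


def namenode_uri_from_core_site(xml_text, keys=("fs.default.name", "fs.defaultFS")):
    """Return the first NameNode URI found under the given property names, or None.

    Checking fs.defaultFS as a fallback is an assumption, not a documented rule.
    """
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name and value:
            props[name.strip()] = value.strip()
    for key in keys:
        if key in props:
            return props[key]
    return None


# Hypothetical core-site.xml content; the host and port are placeholders.
SAMPLE_CORE_SITE = """<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://namenode.example.com:8020</value>
  </property>
</configuration>
"""

uri = namenode_uri_from_core_site(SAMPLE_CORE_SITE)
parsed = urlparse(uri)

# The scheme, host, and port correspond to hdfs://<namenode>:<port>.
print(uri)
print(parsed.scheme, parsed.hostname, parsed.port)
```

The value that the function returns is the string to enter for HDFS Connection URI. Splitting it with urlparse is a quick way to confirm that the host and RPC port are the ones you expect.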
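
The Hive URL property uses a different suffix for MapR Ticket and MapR Kerberos clusters. The following Python sketch assembles both forms. It assumes that the ticket form ends in a single ;auth=MAPRSASL segment and that the Kerberos form ends in ;principal=<spn>. The function name, the mode selector, the host, the port, and the principal are hypothetical values chosen for illustration.

```python
def build_hive_url(hostname, port, mode, spn=None):
    """Assemble a Hive JDBC URL for a MapR cluster.

    mode is a hypothetical selector: "ticket" or "kerberos".
    The URL suffixes are assumptions based on the formats in the table.
    """
    base = f"jdbc:hive2://{hostname}:{port}/default"
    if mode == "ticket":
        return base + ";auth=MAPRSASL"
    if mode == "kerberos":
        if not spn:
            raise ValueError("a service principal name is required for Kerberos")
        return base + ";principal=" + spn
    raise ValueError(f"unknown mode: {mode!r}")


# Placeholder host, port, and principal; replace them with your own values.
print(build_hive_url("hive.example.com", 10000, "ticket"))
print(build_hive_url("hive.example.com", 10000, "kerberos",
                     spn="hive/hive.example.com@EXAMPLE.COM"))
```

Keeping the two suffixes in one helper makes it harder to mix ticket and Kerberos settings in the same URL.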
