Integration Guide


Hadoop Connection Properties

Use the Hadoop connection to configure mappings to run on a Hadoop cluster. A Hadoop connection is a cluster type connection. You can create and manage a Hadoop connection in the Administrator tool or the Developer tool. You can use infacmd to create a Hadoop connection. Hadoop connection properties are case sensitive unless otherwise noted.

You can configure run-time properties for the Hadoop environment in the Data Integration Service, in the Hadoop connection, and in the mapping. You can override a property configured at a higher level by setting the value at a lower level. For example, if you configure a property in the Data Integration Service custom properties, you can override it in the Hadoop connection or in the mapping. The Data Integration Service processes property overrides in the following order of priority, from highest to lowest:

1. Mapping custom properties set using infacmd ms runMapping with the -cp option

2. Mapping run-time properties for the Hadoop environment

3. Hadoop connection advanced properties for run-time engines

4. Hadoop connection advanced general properties, environment variables, and classpaths

5. Data Integration Service custom properties
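The priority order above can be modeled as a lookup that walks the layers from most specific to least specific and returns the first value it finds. The following is a minimal Python sketch of that idea, not the Data Integration Service's actual implementation. The layer names, the function name `resolve_property`, and the sample property `spark.executor.memory` are hypothetical and chosen only for illustration.

```python
from typing import Dict, Mapping, Optional, Tuple

# Layers ordered from highest to lowest priority, mirroring the list above.
# The identifiers are hypothetical labels, not Informatica names.
PRIORITY_ORDER = (
    "mapping_custom_properties",      # set with infacmd ms runMapping -cp
    "mapping_runtime_properties",     # mapping run-time properties
    "connection_engine_properties",   # Hadoop connection, run-time engines
    "connection_general_properties",  # Hadoop connection, general/env/classpath
    "dis_custom_properties",          # Data Integration Service custom properties
)


def resolve_property(
    name: str, layers: Mapping[str, Mapping[str, str]]
) -> Optional[Tuple[str, str]]:
    """Return (value, layer) from the highest-priority layer that sets `name`.

    Returns None when no layer defines the property.
    """
    for layer in PRIORITY_ORDER:
        props = layers.get(layer, {})
        if name in props:
            return props[name], layer
    return None


if __name__ == "__main__":
    # Illustrative values only: the same property is set at two levels.
    configured: Dict[str, Dict[str, str]] = {
        "dis_custom_properties": {"spark.executor.memory": "2g"},
        "connection_engine_properties": {"spark.executor.memory": "4g"},
    }
    print(resolve_property("spark.executor.memory", configured))
```

In this sketch the value set in the Hadoop connection wins over the value set in the Data Integration Service custom properties, which matches the override behavior described above.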

When a mapping uses Hive Server 2 to run a job or parts of a job, you cannot override properties that are configured at the cluster level in pre-SQL or post-SQL queries or in SQL override statements.

Workaround: Instead of attempting to use the cluster configuration on the domain to override cluster properties, pass the override settings to the JDBC URL. For example:

beeline -u "jdbc:hive2://<domain host>:<port_number>/tpch_text_100" --hiveconf hive.execution.engine=tez
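When several overrides are needed, the command above can be assembled programmatically so that each setting becomes its own `--hiveconf key=value` argument. The following Python sketch shows one way to do that. The function name `build_beeline_command`, the host `hive.example.com`, and the port `10000` are assumptions for illustration; substitute the values for your environment.

```python
import shlex
from typing import List, Mapping


def build_beeline_command(
    host: str, port: int, database: str, overrides: Mapping[str, str]
) -> List[str]:
    """Build a beeline argument list with one --hiveconf flag per override."""
    url = f"jdbc:hive2://{host}:{port}/{database}"
    argv = ["beeline", "-u", url]
    for key, value in overrides.items():
        # Each override is passed as a separate --hiveconf key=value pair.
        argv += ["--hiveconf", f"{key}={value}"]
    return argv


if __name__ == "__main__":
    # Hypothetical host and port; the database and override come from
    # the example above.
    cmd = build_beeline_command(
        "hive.example.com",
        10000,
        "tpch_text_100",
        {"hive.execution.engine": "tez"},
    )
    # shlex.join quotes arguments where needed for display in a shell.
    print(shlex.join(cmd))
```

Returning an argument list rather than a single string lets the caller pass it directly to a process launcher without additional shell quoting.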

