Shared Content for Data Engineering
The following options define a Hadoop connection:

connectionId
String that the Data Integration Service uses to identify the connection. The ID is not case sensitive. It must be 255 characters or less and must be unique in the domain. You cannot change this property after you create the connection. Default value is the connection name.

connectionType
Required. Type of connection is Hadoop.

name
The name of the connection. The name is not case sensitive and must be unique within the domain. You can change this property after you create the connection. The name cannot exceed 128 characters, contain spaces, or contain the following special characters:
~ ` ! $ % ^ & * ( ) - + = { [ } ] | \ : ; " ' < , > . ? /

RMAddress
The service within Hadoop that submits requests for resources or spawns YARN applications.
Use the following format:
<hostname>:<port>
Where <hostname> is the host name or IP address of the Yarn resource manager and <port> is the port on which the Yarn resource manager listens for remote procedure calls (RPC).
For example, enter:
myhostname:8032
You can also get the Resource Manager Address property from yarn-site.xml located in the following directory on the Hadoop cluster:
/etc/hadoop/conf/
The Resource Manager Address appears as the yarn.resourcemanager.address property in yarn-site.xml.
Optionally, if the yarn.resourcemanager.address property is not configured in yarn-site.xml, you can find the host name from the yarn.resourcemanager.hostname or yarn.resourcemanager.scheduler.address properties in yarn-site.xml. You can then configure the Resource Manager Address in the Hadoop connection with the following value:
hostname:8032

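As an illustration of the yarn-site.xml lookup described above, the entry might look like the following sketch; the host name and port are placeholder values:
<!-- Illustrative yarn-site.xml entry; myhostname:8032 is a placeholder. -->
<property>
  <name>yarn.resourcemanager.address</name>
  <value>myhostname:8032</value>
</property>
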
cadiAppYarnQueueName
The YARN scheduler queue name used by the Blaze engine that specifies available resources on a cluster. The name is case sensitive.

cadiExecutionParameterList
Custom properties that are unique to the Blaze engine.
Use the following format:
<property1>=<value>
Where <property1> is the name of the custom property and <value> is the value to assign to it.
To specify multiple properties, use &: as the property separator, as shown in the example after this description.
Use custom properties only at the request of Informatica Global Customer Support.

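For example, a hypothetical list that sets two custom properties (the property names are placeholders, not documented Blaze properties) would look like this:
property1=value1&:property2=value2
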
cadiMaxPort
The maximum value for the port number range for the Blaze engine.

cadiMinPort
The minimum value for the port number range for the Blaze engine.

cadiUserName
The operating system profile user name for the Blaze engine.

cadiWorkingDirectory
The HDFS file path of the directory that the Blaze engine uses to store temporary files. Verify that the directory exists. The YARN user, Blaze engine user, and mapping impersonation user must have write permission on this directory.

databaseName
Namespace for tables. Use the name default for tables that do not have a specified database name.

defaultFSURI
The URI to access the default Hadoop Distributed File System.
Use the following connection URI:
hdfs://<node name>:<port>
Where <node name> is the host name or IP address of the NameNode and <port> is the port on which the NameNode listens for remote procedure calls (RPC).
For example, enter:
hdfs://myhostname:8020/
You can also get the Default File System URI property from core-site.xml located in the following directory on the Hadoop cluster:
/etc/hadoop/conf/
Use the value from the fs.defaultFS property found in core-site.xml, for example, hdfs://myhostname:8020.
If the Hadoop cluster runs MapR, use the following URI to access the MapR File system:
maprfs:///

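As a sketch, the fs.defaultFS entry in core-site.xml might look like the following; the URI is a placeholder:
<!-- Illustrative core-site.xml entry; the URI is a placeholder. -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://myhostname:8020</value>
</property>
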
engineType
The engine that the Hadoop environment uses to run a mapping on the Hadoop cluster. Select a value from the drop-down list.
For example, select:
MRv2
To set the engine type in the Hadoop connection, you must get the value for the mapreduce.framework.name property from mapred-site.xml located in the following directory on the Hadoop cluster:
/etc/hadoop/conf/
If the value for mapreduce.framework.name is classic, select mrv1 as the engine type in the Hadoop connection.
If the value for mapreduce.framework.name is yarn, you can select mrv2 or tez as the engine type in the Hadoop connection. Do not select Tez if Tez is not configured for the Hadoop cluster.
You can also set the value for the engine type in hive-site.xml. The engine type appears as the hive.execution.engine property in hive-site.xml.

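As a sketch, the mapreduce.framework.name entry in mapred-site.xml might look like the following; the value yarn is one of the two possibilities described above:
<!-- Illustrative mapred-site.xml entry; the value is classic or yarn. -->
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
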
environmentSQL
SQL commands to set the Hadoop environment. The Data Integration Service executes the environment SQL at the beginning of each Hive script generated in a Hive execution plan.
Specific rules and guidelines apply to the usage of environment SQL.

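A minimal sketch of environment SQL, assuming you want to register a Hive user-defined function before each generated script runs; the jar path, function name, and class are hypothetical:
-- Hypothetical environment SQL: register a UDF before each Hive script runs.
ADD JAR /path/to/my_udfs.jar;
CREATE TEMPORARY FUNCTION my_udf AS 'com.example.MyUDF';
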
hadoopExecEnvExecutionParameterList
Custom properties that are unique to the Hadoop environment.
Use the following format:
<property1>=<value>
Where <property1> is the name of the custom property and <value> is the value to assign to it.
To specify multiple properties, use &: as the property separator, as shown in the example under cadiExecutionParameterList above.
Use custom properties only at the request of Informatica Global Customer Support.

hiveWarehouseDirectoryOnHDFS
The absolute HDFS file path of the default database for the warehouse that is local to the cluster. For example, the following file path specifies a local warehouse:
/user/hive/warehouse
For Cloudera CDH, if the Metastore Execution Mode is remote, then the file path must match the file path specified by the Hive Metastore Service on the Hadoop cluster.
You can get the value for the Hive Warehouse Directory on HDFS from the hive.metastore.warehouse.dir property in hive-site.xml located in the following directory on the Hadoop cluster:
/etc/hadoop/conf/
For example, use the value of hive.metastore.warehouse.dir, such as /user/hive/warehouse.
For MapR, hive-site.xml is located in the following directory:
/opt/mapr/hive/<hive version>/conf

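As a sketch, the hive.metastore.warehouse.dir entry in hive-site.xml might look like the following; the path matches the example above:
<!-- Illustrative hive-site.xml entry. -->
<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>/user/hive/warehouse</value>
</property>
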
jobMonitoringURL
The URL for the MapReduce JobHistory server. You can use the URL for the JobTracker URI if you use MapReduce version 1.
Use the following format:
<hostname>:<port>
Where <hostname> is the host name or IP address of the JobHistory server and <port> is the port on which the JobHistory server listens for remote procedure calls (RPC).
For example, enter:
myhostname:8021
You can get the value for the Job Monitoring URL from the mapreduce.jobhistory.address property in mapred-site.xml.

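Assuming the mapreduce.jobhistory.address property noted above, the mapred-site.xml entry might look like the following sketch; the host name and port are placeholders:
<!-- Illustrative mapred-site.xml entry; host name and port are placeholders. -->
<property>
  <name>mapreduce.jobhistory.address</name>
  <value>myhostname:8021</value>
</property>
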
metastoreDatabaseDriver
Driver class name for the JDBC data store. For example, the following class name specifies a MySQL driver:
com.mysql.jdbc.Driver
You can get the value for the Metastore Database Driver from the javax.jdo.option.ConnectionDriverName property in hive-site.xml.

metastoreDatabasePassword
The password for the metastore user name.
You can get the value for the Metastore Database Password from the javax.jdo.option.ConnectionPassword property in hive-site.xml.

metastoreDatabaseURI
The JDBC connection URI used to access the data store in a local metastore setup. Use the following connection URI:
jdbc:<datastore type>://<node name>:<port>/<database name>
Where <datastore type> is the type of the data store, <node name> is the host name or IP address of the data store, <port> is the port on which the data store listens for remote procedure calls (RPC), and <database name> is the name of the database.
For example, the following URI specifies a local metastore that uses MySQL as a data store:
jdbc:mysql://hostname23:3306/metastore
You can get the value for the Metastore Database URI from the javax.jdo.option.ConnectionURL property in hive-site.xml.

metastoreDatabaseUserName
The metastore database user name.
You can get the value for the Metastore Database User Name from the javax.jdo.option.ConnectionUserName property in hive-site.xml.

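Taken together, the four metastore database options above map to hive-site.xml entries that might look like the following sketch for a local metastore backed by MySQL; the host, port, database name, and credentials are placeholders:
<!-- Illustrative hive-site.xml entries for a local metastore backed by MySQL.
     Host, port, database, user name, and password are placeholders. -->
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://hostname23:3306/metastore</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hiveuser</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>hivepassword</value>
</property>
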
metastoreMode
Controls whether to connect to a remote metastore or a local metastore. By default, local is selected. For a local metastore, you must specify the Metastore Database URI, Metastore Database Driver, Username, and Password. For a remote metastore, you must specify only the Remote Metastore URI.
You can get the value for the Metastore Execution Mode from the hive.metastore.local property in hive-site.xml.
The hive.metastore.local property is deprecated in hive-site.xml for Hive server versions 0.9 and above. If the hive.metastore.local property does not exist but the hive.metastore.uris property exists, and you know that the Hive server has started, you can set the connection to a remote metastore.

remoteMetastoreURI
The metastore URI used to access metadata in a remote metastore setup. For a remote metastore, you must specify the Thrift server details.
Use the following connection URI:
thrift://<hostname>:<port>
Where <hostname> is the host name or IP address of the Thrift metastore server and <port> is the port on which the Thrift server listens.
For example, enter:
thrift://myhostname:9083/
You can get the value for the Remote Metastore URI from the hive.metastore.uris property in hive-site.xml.

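As a sketch, the hive.metastore.uris entry in hive-site.xml might look like the following; the host name and port are placeholders:
<!-- Illustrative hive-site.xml entry; host name and port are placeholders. -->
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://myhostname:9083</value>
</property>
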
stgDataCompressionCodecClass
Codec class name that enables data compression and improves performance on temporary staging tables.

stgDataCompressionCodecType
Hadoop compression library for a compression codec class name.