Integration Guide

10.5.4


Create a Spark Staging Directory

When the Spark engine runs a job, it stores temporary files in a staging directory.

Optionally, create a staging directory on HDFS for the Spark engine. For example:

hadoop fs -mkdir -p /spark/staging
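The command above can be wrapped in a small helper that also confirms the directory exists afterwards. This is a minimal sketch, not part of the product: the function name `create_spark_staging_dir` and the `HADOOP_BIN` override variable are illustrative assumptions, and `/spark/staging` is only the example path used above.

```shell
# Hypothetical helper: create an HDFS staging directory and confirm it exists.
# HADOOP_BIN is an assumed override for the hadoop executable (defaults to "hadoop").
HADOOP_BIN="${HADOOP_BIN:-hadoop}"

create_spark_staging_dir() {
    local dir="$1"
    if [ -z "$dir" ]; then
        echo "usage: create_spark_staging_dir <hdfs path>" >&2
        return 2
    fi
    # -p creates missing parent directories and does not fail if the path exists.
    "$HADOOP_BIN" fs -mkdir -p "$dir" || return 1
    # -test -d returns 0 only when the path exists and is a directory.
    "$HADOOP_BIN" fs -test -d "$dir"
}
```

For example, after sourcing the snippet on a node with the Hadoop client installed, run `create_spark_staging_dir /spark/staging`. A zero exit status means the directory is present.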

If you want to write the logs to the Informatica Hadoop staging directory, you do not need to create a Spark staging directory. By default, the Data Integration Service uses the HDFS directory /tmp/SPARK_<user name>.

Grant permission on the staging directory to the following users:

Hadoop impersonation user

SPN of the Data Integration Service

Mapping impersonation users

Optionally, you can assign 777 permissions on the directory.
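As a rough illustration of the optional permission step, the sketch below applies mode 777 with `hadoop fs -chmod` and then lists the directory entry so the resulting mode can be checked. The function name `open_staging_dir_permissions` and the `HADOOP_BIN` variable are assumptions made for this example. Mode 777 gives every user read, write, and execute access, so treat it as the broad option the text describes rather than a required setting.

```shell
# Hypothetical helper: apply mode 777 to an HDFS staging directory and show the result.
# HADOOP_BIN is an assumed override for the hadoop executable (defaults to "hadoop").
HADOOP_BIN="${HADOOP_BIN:-hadoop}"

open_staging_dir_permissions() {
    local dir="$1"
    if [ -z "$dir" ]; then
        echo "usage: open_staging_dir_permissions <hdfs path>" >&2
        return 2
    fi
    # 777 gives read, write, and execute access to owner, group, and others.
    "$HADOOP_BIN" fs -chmod 777 "$dir" || return 1
    # -ls -d lists the directory entry itself, so the new mode is visible.
    "$HADOOP_BIN" fs -ls -d "$dir"
}
```

Usage, assuming the example path from earlier in this topic: `open_staging_dir_permissions /spark/staging`.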

If you create a staging directory on a CDP Data Hub cluster, grant Access Control List (ACL) permissions for the staging directory to the Hive user and the impersonation user. To grant ACL permissions, run the following command on the CDP Data Hub cluster:

hadoop fs -setfacl -m user:<user name>:rwx <staging directory>
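Because the ACL has to be granted to more than one user, it may be convenient to loop over the user names and then print the resulting ACL with `hadoop fs -getfacl`. The following is a sketch under stated assumptions: the function name `grant_staging_acl`, the `HADOOP_BIN` variable, and the user names in the usage example are hypothetical and must be replaced with the actual Hive user and impersonation user on your cluster.

```shell
# Hypothetical helper: grant rwx ACL entries on a staging directory to several users.
# HADOOP_BIN is an assumed override for the hadoop executable (defaults to "hadoop").
HADOOP_BIN="${HADOOP_BIN:-hadoop}"

grant_staging_acl() {
    # Require a directory and at least one user name.
    if [ "$#" -lt 2 ]; then
        echo "usage: grant_staging_acl <staging directory> <user name>..." >&2
        return 2
    fi
    local dir="$1"
    shift
    local user
    for user in "$@"; do
        # -m modifies the ACL, adding or updating the entry for this user.
        "$HADOOP_BIN" fs -setfacl -m "user:${user}:rwx" "$dir" || return 1
    done
    # Print the ACL so the new entries can be checked.
    "$HADOOP_BIN" fs -getfacl "$dir"
}
```

Usage with placeholder names: `grant_staging_acl /spark/staging hive etl_user`, where `hive` stands for the Hive user and `etl_user` for the impersonation user.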
