When the Spark engine runs a job, it stores temporary files in a staging directory.
Optionally, create a staging directory on HDFS for the Spark engine. For example:
hadoop fs -mkdir -p /spark/staging
If you want to write the logs to the Informatica Hadoop staging directory, you do not need to create a Spark staging directory. By default, the Data Integration Service uses the HDFS directory /tmp/SPARK_<user name>.
Grant permission on the staging directory to the following users:
Hadoop impersonation user
SPN of the Data Integration Service
Mapping impersonation users
Optionally, you can assign 777 permissions (read, write, and execute for all users) on the directory.
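As an illustration of the permission step, the following minimal sketch creates the staging directory and opens it to all users. The function name open_staging_dir is hypothetical, and the /spark/staging default is only the example path used earlier. The sketch assumes that the hadoop CLI is on the PATH and that you run it as a user who is allowed to change the mode of the directory.

```shell
# Sketch only: open_staging_dir is a hypothetical helper, not an Informatica command.
# Assumes the hadoop CLI is on PATH; /spark/staging is the example path, adjust as needed.
open_staging_dir() {
    staging_dir="${1:-/spark/staging}"

    # Create the directory and any missing parents; stop if creation fails.
    hadoop fs -mkdir -p "$staging_dir" || return 1

    # Mode 777 grants read, write, and execute to owner, group, and all other users.
    hadoop fs -chmod 777 "$staging_dir"
}
```

To apply it to the example directory, call open_staging_dir /spark/staging. Because mode 777 opens the directory to every user, this is one possible way to cover the users listed above, not the only one.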
If you create a staging directory on a CDP Data Hub cluster, grant Access Control List (ACL) permissions for the staging directory to the Hive user and the impersonation user. To grant ACL permissions, run the following command on the CDP Data Hub cluster: