Integration Guide

10.4.1

Create a Sqoop Staging Directory

When you run Sqoop jobs on the Spark engine, the Data Integration Service by default creates a Sqoop staging directory named sqoop_staging within the Spark staging directory. You can configure the Spark staging directory that you want to use in the Hadoop connection.

However, based on your processing requirements, you might need to create the directory manually and give write permissions to the Hive super user. When you create the sqoop_staging directory manually, the Data Integration Service uses this directory instead of creating another one.

Create a Sqoop staging directory named sqoop_staging manually in the following situations:

- You run a Sqoop pass-through mapping on the Spark engine to read data from a Sqoop source and write data to a Hive target that uses the Text format.
- You use a Cloudera CDH cluster with Sentry authorization, a Cloudera CDP cluster with Ranger authorization, or a Hortonworks HDP cluster with Ranger authorization.

After you create the sqoop_staging directory, you must add an Access Control List (ACL) for the directory and grant write permissions to the Hive super user. To do so, run the following command on the Cloudera CDH cluster or the Hortonworks HDP cluster:

hdfs dfs -setfacl -m default:user:hive:rwx /

sqoop_staging
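
The manual setup described above can be sketched as a short dry-run script. This is a minimal illustration, not a procedure from the product documentation: the function name sqoop_staging_commands, the SPARK_STAGING_DIR variable, and the example path /path/to/spark_staging are hypothetical placeholders, and the real Spark staging directory is whatever you configured in the Hadoop connection. The script only prints the HDFS commands (create the directory, add the default ACL entry for the hive user, then list the ACL) so that you can review them before running them yourself.

```shell
#!/bin/sh
# Hypothetical helper: print the HDFS commands that prepare sqoop_staging.
# Nothing is executed against the cluster; the commands are only echoed.

sqoop_staging_commands() {
    # $1 is the Spark staging directory (assumed value, taken from the
    # Hadoop connection). Drop one trailing slash, if present.
    spark_staging_dir="${1%/}"
    target="${spark_staging_dir}/sqoop_staging"

    # Create the directory and any missing parent directories.
    echo "hdfs dfs -mkdir -p ${target}"
    # Add a default ACL entry that grants rwx to the hive user.
    echo "hdfs dfs -setfacl -m default:user:hive:rwx ${target}"
    # Show the resulting ACL so the new entry can be checked.
    echo "hdfs dfs -getfacl ${target}"
}

# Dry run with a placeholder path; set SPARK_STAGING_DIR to override it.
sqoop_staging_commands "${SPARK_STAGING_DIR:-/path/to/spark_staging}"
```

Printing the commands instead of executing them keeps the sketch safe to run anywhere. Once the path is confirmed, the printed lines can be run as a user who is allowed to modify ACLs on the staging directory.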

For information about Sentry authorization, see the Cloudera documentation. For information about Ranger authorization, see the Hortonworks documentation.
