Configure Storage Access

Depending on the cluster storage type, you configure either the storage account access key or the client credentials of a service principal to access the storage in the cluster. Add the configuration to the Spark configuration on the Databricks cluster.

Configure ADLS Storage Access

If you use ADLS storage, you must set the Hadoop credential configuration options as Databricks Spark options. Add "spark.hadoop" as a prefix to each Hadoop configuration key as shown in the following text:
spark.hadoop.dfs.adls.oauth2.access.token.provider.type ClientCredential
spark.hadoop.dfs.adls.oauth2.client.id <your-service-client-id>
spark.hadoop.dfs.adls.oauth2.credential <your-service-credentials>
spark.hadoop.dfs.adls.oauth2.refresh.url "https://login.microsoftonline.com/<your-directory-id>/oauth2/token"
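The four pairs above can also be assembled programmatically, for example when provisioning clusters through automation. The following is a minimal Python sketch; the function name and its parameters (`client_id`, `client_secret`, `directory_id`) are illustrative placeholders, while the configuration keys themselves are the ones listed above.

```python
def adls_oauth_spark_conf(client_id, client_secret, directory_id):
    """Build the spark.hadoop-prefixed ADLS OAuth settings as a dict.

    Each returned key/value pair corresponds to one line of the
    Databricks cluster's Spark configuration.
    """
    prefix = "spark.hadoop.dfs.adls.oauth2."
    return {
        prefix + "access.token.provider.type": "ClientCredential",
        prefix + "client.id": client_id,
        prefix + "credential": client_secret,
        prefix + "refresh.url":
            f"https://login.microsoftonline.com/{directory_id}/oauth2/token",
    }

# Render the pairs in the "key value" form expected by the Spark config box.
conf = adls_oauth_spark_conf("<your-service-client-id>",
                             "<your-service-credentials>",
                             "<your-directory-id>")
for key, value in conf.items():
    print(key, value)
```

Paste the printed lines into the Spark configuration field of the Databricks cluster.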

Configure WASB Storage Access

If you use WASB storage, you must set a Hadoop configuration key based on one of the following methods of access:
Account access key
If you use an account access key, add "spark.hadoop" as a prefix to the Hadoop configuration key as shown in the following text:
spark.hadoop.fs.azure.account.key.<your-storage-account-name>.blob.core.windows.net <your-storage-account-access-key>
SAS token
If you use a SAS token, add "spark.hadoop" as a prefix to the Hadoop configuration key as shown in the following text:
spark.hadoop.fs.azure.sas.<your-container-name>.<your-storage-account-name>.blob.core.windows.net <complete-query-string-of-your-sas-for-the-container>
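Both WASB methods produce a single configuration pair whose key embeds the storage account (and, for SAS, the container). The following Python sketch shows that construction; the function name and parameters are illustrative placeholders, while the key patterns match the two lines above.

```python
def wasb_spark_conf(storage_account, access_key=None,
                    container=None, sas_token=None):
    """Return the single spark.hadoop config pair for WASB access.

    Supply exactly one method: an account access key, or a
    container-scoped SAS token (container plus sas_token).
    """
    host = f"{storage_account}.blob.core.windows.net"
    if access_key is not None:
        # Account access key: key is scoped to the whole storage account.
        return {f"spark.hadoop.fs.azure.account.key.{host}": access_key}
    if container is not None and sas_token is not None:
        # SAS token: key is scoped to one container in the account.
        return {f"spark.hadoop.fs.azure.sas.{container}.{host}": sas_token}
    raise ValueError("supply access_key, or container and sas_token")

# Account access key variant:
for key, value in wasb_spark_conf("<your-storage-account-name>",
                                  access_key="<your-storage-account-access-key>").items():
    print(key, value)
```

As with ADLS, paste the printed line into the Spark configuration field of the Databricks cluster.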
