Table of Contents

  1. Preface
  2. Introduction to PowerExchange for Amazon S3
  3. PowerExchange for Amazon S3 Configuration Overview
  4. Amazon S3 Connections
  5. PowerExchange for Amazon S3 Data Objects
  6. PowerExchange for Amazon S3 Mappings
  7. PowerExchange for Amazon S3 Lookups
  8. Appendix A: Amazon S3 Data Type Reference
  9. Appendix B: Troubleshooting

PowerExchange for Amazon S3 User Guide

Configure Databricks Cluster

Set the access key ID and secret access key values under Spark Config in your Databricks cluster configuration to access Amazon S3 storage. Specify one key-value pair per line, with the key and value separated by a single space:
spark.hadoop.fs.s3a.awsAccessKeyId xxyyzz
spark.hadoop.fs.s3a.awsSecretAccessKey xxxyyyzzz
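If you script cluster setup, the one-pair-per-line, space-separated layout required above can be generated programmatically. The following Python sketch is illustrative only and is not part of the product; the helper name and the placeholder credential values are assumptions:

```python
def render_spark_conf(pairs):
    """Render Spark Config entries: one key-value pair per line,
    with the key and value separated by a single space."""
    return "\n".join(f"{key} {value}" for key, value in pairs.items())

# Placeholder credentials, matching the format shown above.
conf = render_spark_conf({
    "spark.hadoop.fs.s3a.awsAccessKeyId": "xxyyzz",
    "spark.hadoop.fs.s3a.awsSecretAccessKey": "xxxyyyzzz",
})
print(conf)
```

The rendered text can then be pasted into the Spark Config field of the cluster configuration.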

Access using IAM role

Optional. Create an IAM role associated with the AWS account of the Databricks deployment. The Amazon S3 bucket must belong to the same account that is associated with the Databricks deployment. If the bucket belongs to a different AWS account, a cross-account bucket policy must be enabled to access the bucket.
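For the cross-account case, a bucket policy attached to the bucket in the other account might look like the following sketch. This is an assumption for illustration, not a policy from this guide; the account ID, role name, bucket name, and the exact set of allowed actions are placeholders you must adapt:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowDatabricksDeploymentRole",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::<deployment-account-id>:role/<databricks-iam-role>"
      },
      "Action": ["s3:GetObject", "s3:PutObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::<bucket-name>",
        "arn:aws:s3:::<bucket-name>/*"
      ]
    }
  ]
}
```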

Server-side S3 encryption (AES-256)

Optional. Set the server-side-encryption-algorithm property under Spark Config in your Databricks cluster configuration:
spark.hadoop.fs.s3a.server-side-encryption-algorithm AES256

Server-side encryption using SSE-KMS

Optional. Set the following properties under Spark Config in your Databricks cluster configuration:
spark.hadoop.fs.s3a.server-side-encryption-kms-master-key-id arn:aws:kms:us-west-XX:key/XXXYYYYYYY
spark.hadoop.fs.s3a.server-side-encryption-algorithm aws:kms
spark.hadoop.fs.s3a.impl com.databricks.s3a.S3AFileSystem
