Connections for INFACore

Amazon Athena

Create an Amazon Athena connection to read data from Amazon Athena tables and views.

Feature snapshot

| Operation | Support |
|-----------|---------|
| Read      | Yes     |
| Write     | No      |

Before you begin

You can use an Amazon Athena connection after the organization administrator performs the following tasks:
  • Manages authentication by creating an access key and a secret key. The access and secret keys are required when you configure an Amazon Athena connection.
  • Creates an AWS Key Management Service (AWS KMS)-managed customer master key if you want to enable server-side encryption or client-side encryption.
  • Creates the minimal Amazon Identity and Access Management (IAM) policy, AWS Glue data catalog policy, and Amazon Athena policy for an Amazon Athena connection.
Create a minimal Amazon IAM policy
Create an Amazon IAM policy and define the permissions to store Amazon Athena results on Amazon S3.
Use the following minimum required permissions to store Amazon Athena results on Amazon S3:
  • PutObject
  • GetObject
  • DeleteObject
  • ListBucket
You can use the following sample Amazon IAM policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetObject",
        "s3:DeleteObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::<bucket_name>/*",
        "arn:aws:s3:::<bucket_name>"
      ]
    }
  ]
}
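If you maintain several result buckets, it can help to generate the sample policy instead of editing the bucket name by hand. The following is a minimal sketch in plain Python that fills in the <bucket_name> placeholder. The function name build_s3_results_policy and the bucket name my-athena-results are hypothetical and are not part of INFACore or AWS.

```python
import json


def build_s3_results_policy(bucket_name: str) -> dict:
    """Return the sample IAM policy above with the bucket name filled in."""
    if not bucket_name:
        raise ValueError("bucket_name must not be empty")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "s3:PutObject",
                    "s3:GetObject",
                    "s3:DeleteObject",
                    "s3:ListBucket",
                ],
                "Resource": [
                    # Object-level actions apply to the objects in the bucket.
                    f"arn:aws:s3:::{bucket_name}/*",
                    # ListBucket applies to the bucket itself.
                    f"arn:aws:s3:::{bucket_name}",
                ],
            }
        ],
    }


# "my-athena-results" is a placeholder bucket name used only for illustration.
print(json.dumps(build_s3_results_policy("my-athena-results"), indent=2))
```

The printed JSON can be pasted into the policy editor that your administrator uses.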
Create an AWS Glue data catalog policy
You can use AWS IAM to define policies and roles that are needed to access resources used by AWS Glue.
You can use the following sample policy for AWS Glue data catalog:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:*"
      ],
      "Resource": [
        "*"
      ]
    }
  ]
}
Create an Amazon Athena policy
Specify the minimum required permissions for Amazon Athena Connector to read data from views and external tables in the AWS Glue data catalog and to read and query Amazon S3 files.
You can use the following minimum required permissions:
  • GetWorkGroup
  • GetTableMetadata
  • StartQueryExecution
  • GetQueryResultsStream
  • ListDatabases
  • GetQueryExecution
  • GetQueryResults
  • GetDatabase
  • ListTableMetadata
  • GetDataCatalog
You can use the following sample policy for Amazon Athena:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "athena:GetWorkGroup",
        "athena:GetTableMetadata",
        "athena:StartQueryExecution",
        "athena:GetQueryResultsStream",
        "athena:ListDatabases",
        "athena:GetQueryExecution",
        "athena:GetQueryResults",
        "athena:GetDatabase",
        "athena:ListTableMetadata",
        "athena:GetDataCatalog"
      ],
      "Resource": [
        "arn:aws:athena:*:*:workgroup/*",
        "arn:aws:athena:*:*:datacatalog/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "athena:ListDataCatalogs",
        "athena:ListWorkGroups"
      ],
      "Resource": "*"
    }
  ]
}
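To check an existing policy document against the minimum permissions listed above, you could use a small helper like the following sketch. It is a simplification and not an IAM evaluator: it only looks at Allow statements, ignores Resource and Deny entries, and recognizes only exact action names plus the full wildcards athena:* and *. The function name missing_athena_actions is hypothetical.

```python
REQUIRED_ATHENA_ACTIONS = {
    "athena:GetWorkGroup",
    "athena:GetTableMetadata",
    "athena:StartQueryExecution",
    "athena:GetQueryResultsStream",
    "athena:ListDatabases",
    "athena:GetQueryExecution",
    "athena:GetQueryResults",
    "athena:GetDatabase",
    "athena:ListTableMetadata",
    "athena:GetDataCatalog",
}


def missing_athena_actions(policy: dict) -> set:
    """Return the required Athena actions that no Allow statement lists."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # IAM permits a single statement object
        statements = [statements]

    allowed = set()
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):  # IAM permits a single action string
            actions = [actions]
        allowed.update(actions)

    # Simplification: only full wildcards are understood, not "athena:Get*".
    if "*" in allowed or "athena:*" in allowed:
        return set()
    return REQUIRED_ATHENA_ACTIONS - allowed
```

An empty result means that every minimum action appears in the policy. A non-empty result names the actions to add.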

Connection properties

The following table describes the Amazon Athena connection properties:
Connection property
Description
Connection Name
Name of the connection.
Each connection name must be unique within the organization. Connection names can contain alphanumeric characters, spaces, and the following special characters: _ . + -,
Maximum length is 255 characters.
Authentication Type
The authentication mechanism to connect to Amazon Athena.
Select Permanent IAM Credentials.
Access Key
Optional. The access key to connect to Amazon Athena.
Secret Key
Optional. The secret key to connect to Amazon Athena.
JDBC URL
The URL of the Amazon Athena connection.
Enter the JDBC URL in the following format:
jdbc:awsathena://AwsRegion=<region_name>;S3OutputLocation=<S3_Output_Location>;
You can use pagination to fetch the Amazon Athena query results. Set the property UseResultsetStreaming=0 to use pagination.
Enter the property in the following format:
jdbc:awsathena://AwsRegion=<region_name>;S3OutputLocation=<S3_Output_Location>;UseResultsetStreaming=0;
You can also use streaming to improve the performance and fetch the Amazon Athena query results faster. When you use streaming, ensure that port 444 is open.
By default, streaming is enabled.
Customer Master Key ID
Optional. Specify the customer master key ID generated by AWS Key Management Service (AWS KMS) or the Amazon Resource Name (ARN) of your custom key for cross-account access.
You must generate the customer master key ID for the same region where your Amazon S3 bucket resides. You can either specify the customer-generated customer master key ID or the default customer master key ID.
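The JDBC URL format described above can be assembled programmatically. The following sketch builds the URL from a region and an S3 output location and appends UseResultsetStreaming=0 only when pagination is requested, because streaming is the default. The function name build_athena_jdbc_url and the sample values are hypothetical; only the URL format comes from the property description.

```python
def build_athena_jdbc_url(
    region: str,
    s3_output_location: str,
    use_streaming: bool = True,
) -> str:
    """Build an Amazon Athena JDBC URL in the format documented above."""
    if not region or not s3_output_location:
        raise ValueError("region and s3_output_location are required")

    url = (
        f"jdbc:awsathena://AwsRegion={region};"
        f"S3OutputLocation={s3_output_location};"
    )
    if not use_streaming:
        # Pagination is selected by turning result set streaming off.
        url += "UseResultsetStreaming=0;"
    return url


# Placeholder region and bucket path, used only for illustration.
print(build_athena_jdbc_url("us-east-1", "s3://my-athena-results/output/"))
print(
    build_athena_jdbc_url(
        "us-east-1", "s3://my-athena-results/output/", use_streaming=False
    )
)
```

If you keep streaming enabled, remember that port 444 must be open.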

Read properties

The following table describes the advanced source properties that you can configure in the Python code to read from Amazon Athena:
Property
Description
Retain Athena Query Result On S3 File
Specifies whether you want to retain the Amazon Athena query result on the Amazon S3 file. Select the check box to retain the Amazon Athena query result on the Amazon S3 file.
The Amazon Athena query result is stored in the CSV file format.
By default, the Retain Athena Query Result on S3 File check box is not selected.
S3OutputLocation
Specifies the location of the Amazon S3 file that stores the result of the Amazon Athena query.
You can also specify the Amazon S3 file location in the S3OutputLocation parameter in the JDBC URL connection property.
If you specify the Amazon S3 output location in both the connection and the advanced source properties, the Secure Agent uses the Amazon S3 output location specified in the advanced source properties.
Fetch Size
Determines the number of rows to read in one result set from Amazon Athena.
Default is 10000.
Encryption Type
Encrypts the data in the Amazon S3 staging directory.
You can select the following encryption types:
  • None
  • SSE-S3
  • SSE-KMS
  • CSE-KMS
Default is None.
Schema Name
Overrides the schema name of the source object.
Source Table Name
Overrides the table name used in the metadata import with the table name that you specify.
SQL Query
Overrides the default SQL query.
Enclose column names in double quotes. The SQL query is case sensitive. Specify an SQL statement supported by the Amazon Athena database.
When you specify the columns in the SQL query, ensure that the column name in the query matches the source column name in the mapping.
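The precedence rule for S3OutputLocation can be sketched as follows: a value in the advanced source properties wins over the value in the JDBC URL. The example models the read properties as a plain Python dictionary for illustration only. It does not show how INFACore accepts these properties, and the function names are hypothetical.

```python
from typing import Optional

JDBC_PREFIX = "jdbc:awsathena://"


def parse_jdbc_properties(jdbc_url: str) -> dict:
    """Split the key=value pairs that follow the jdbc:awsathena:// prefix."""
    if not jdbc_url.startswith(JDBC_PREFIX):
        raise ValueError("not an Amazon Athena JDBC URL")
    properties = {}
    for part in jdbc_url[len(JDBC_PREFIX):].split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            properties[key.strip()] = value.strip()
    return properties


def resolve_s3_output_location(
    jdbc_url: str, read_properties: dict
) -> Optional[str]:
    """Return the output location that takes effect for a read."""
    override = read_properties.get("S3OutputLocation")
    if override:
        # The advanced source property takes precedence over the connection.
        return override
    return parse_jdbc_properties(jdbc_url).get("S3OutputLocation")


# Placeholder values, used only for illustration.
url = "jdbc:awsathena://AwsRegion=us-east-1;S3OutputLocation=s3://conn-bucket/out/;"
print(resolve_s3_output_location(url, {}))
print(resolve_s3_output_location(url, {"S3OutputLocation": "s3://read-bucket/out/"}))
```

The first call prints the location from the connection. The second prints the location from the read properties.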
