Amazon Redshift Connectors

Amazon Redshift V2 sources in mappings

In a mapping, you can configure a Source transformation to represent an Amazon Redshift V2 source.
The following table describes the Amazon Redshift V2 source properties that you can configure in a Source transformation:
Property
Description
Connection
Name of the source connection. Select a source connection, or click New Parameter to define a new parameter for the source connection.
Source type
Type of the source object.
Select one of the following source types:
  • Single Object
  • Multiple Objects. You can use implicit joins and advanced relationships with multiple objects.
  • Query. When you select the source type as query, you must map all the fields selected in the query in the Field Mapping tab.
  • Parameter
You cannot override the source query object and multiple objects at runtime using parameter files in a mapping.
When you select the source type as query, the boolean values are written as 0 or false to the target.
The query that you specify must not end with a semicolon (;).
Object
Name of the source object.
You can select single or multiple source objects.
Parameter
Select an existing parameter for the source object or click New Parameter to define a new parameter for the source object. The Parameter property appears only if you select Parameter as the source type. If you want to overwrite the parameter at runtime, select the Overwrite Parameter option.
Filter
Filters records based on the filter condition.
You can specify a simple filter or an advanced filter.
Sort
Sorts records based on the conditions you specify. You can specify the following sort conditions:
  • Not parameterized. Select the fields and type of sorting to use.
  • Parameterized. Use a parameter to specify the sort option.
The following table describes the Amazon Redshift V2 advanced source properties that you can configure in a Source transformation:
Property
Description
Read Mode
Specifies the read mode to read data from the Amazon Redshift source.
You can select one of the following read modes:
  • Direct (1). Reads data directly from the Amazon Redshift source without staging the data in Amazon S3.
  • Staging. Reads data from the Amazon Redshift source by staging the data in the S3 bucket.
Default is Staging.
Fetch Size (1)
Determines the number of rows to read in one result set from Amazon Redshift. Applies only when you select the Direct read mode.
Default is 10000.
If you specify a fetch size of 0 or do not specify a fetch size, the entire data set is read at one time rather than in batches.
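The effect of the fetch size can be pictured with a small, connector-independent sketch. The function name read_in_batches and the use of a plain Python iterable as a stand-in for a result set are assumptions made for illustration only; this is not how the Secure Agent is implemented.

```python
from itertools import islice


def read_in_batches(rows, fetch_size=10000):
    """Yield lists of rows, at most fetch_size rows per list.

    A fetch_size of 0 (or None) yields the whole data set as one list,
    mirroring the "read at one time" behavior described above.
    """
    iterator = iter(rows)
    if not fetch_size:
        yield list(iterator)
        return
    while True:
        batch = list(islice(iterator, fetch_size))
        if not batch:
            return
        yield batch


# 25 hypothetical rows read with a fetch size of 10, then with 0.
print([len(batch) for batch in read_in_batches(range(25), 10)])
print([len(batch) for batch in read_in_batches(range(25), 0)])
```

A smaller fetch size means more round trips with less memory per batch; a fetch size of 0 holds every row in memory at once.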
S3 Bucket Name (*)
Amazon S3 bucket name for staging the data.
You can also specify the bucket name with the folder path. If you provide an Amazon S3 bucket name that is in a different region than the Amazon Redshift cluster, you must configure the REGION attribute in the Unload command options.
Enable Compression (*)
Compresses the staging files into the Amazon S3 staging directory.
The task performance improves when the Secure Agent compresses the staging files. Default is selected.
Staging Directory Location (1) (*)
Location of the local staging directory.
When you run a task in a Secure Agent runtime environment, specify a directory path that is available on the corresponding Secure Agent machine in the runtime environment.
Specify the directory path in the following manner: <staging directory>
For example, C:\Temp. Ensure that you have write permissions on the directory.
Unload Options (*)
Unload command options.
Add options to the Unload command to extract data from Amazon Redshift and create staging files on Amazon S3. Provide an Amazon Redshift Role Amazon Resource Name (ARN).
You can add the following options:
  • DELIMITER
  • ESCAPE
  • PARALLEL
  • NULL (1)
  • AWS_IAM_ROLE
  • REGION
  • ADDQUOTES
For example: DELIMITER = \036;ESCAPE = OFF;NULL=text;PARALLEL = ON;AWS_IAM_ROLE=arn:aws:iam::<account ID>:role/<role-name>;REGION = ap-south-1
Specify a directory on the machine that hosts the Secure Agent.
If you do not add the options to the Unload command manually, the Secure Agent uses the default values.
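As a rough illustration of the option string format shown in the example above, the following sketch joins option names and values with semicolons. The helper name build_unload_options, the account ID 123456789012, and the role name example-role are hypothetical; the NAME=value form without spaces follows the NULL=text entry in the example and is an assumption about what the field accepts.

```python
# Option names taken from the list above.
ALLOWED_OPTIONS = {
    "DELIMITER", "ESCAPE", "PARALLEL", "NULL",
    "AWS_IAM_ROLE", "REGION", "ADDQUOTES",
}


def build_unload_options(options):
    """Join option name/value pairs into a 'NAME=value;NAME=value' string."""
    parts = []
    for name, value in options.items():
        if name not in ALLOWED_OPTIONS:
            raise ValueError(f"unsupported Unload option: {name}")
        if ";" in str(value):
            # A semicolon inside a value would be read as an option separator.
            raise ValueError(f"value for {name} must not contain ';'")
        parts.append(f"{name}={value}")
    return ";".join(parts)


# Hypothetical account ID and role name, for illustration only.
print(build_unload_options({
    "DELIMITER": r"\036",
    "ESCAPE": "OFF",
    "NULL": "text",
    "PARALLEL": "ON",
    "AWS_IAM_ROLE": "arn:aws:iam::123456789012:role/example-role",
    "REGION": "ap-south-1",
}))
```

Rejecting semicolons inside values keeps a mistyped ARN (which uses colons, not semicolons) from being split into several options.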
Treat NULL Value as NULL (*)
Retains the null values when you read data from Amazon Redshift.
Encryption Type (*)
Encrypts the data in the Amazon S3 staging directory.
You can select the following encryption types:
  • None
  • SSE-S3
  • SSE-KMS
  • CSE-SMK (1)
Default is None.
Download S3 Files in Multiple Parts (1) (*)
Downloads large Amazon S3 objects in multiple parts.
When the file size of an Amazon S3 object is greater than 8 MB, you can choose to download the object in multiple parts in parallel.
Default is 5 MB.
Multipart Download Threshold Size (1) (*)
The maximum threshold size to download an Amazon S3 object in multiple parts.
Default is 5 MB.
Schema Name
Overrides the default schema name.
You cannot configure a custom query when you use the schema name.
Source Table Name
Overrides the default source table name.
Ensure that the metadata and column order in the override table match those in the source table imported during design time.
When you select the source type as Multiple Objects or Query, you cannot use the Source Table Name option.
Pre-SQL
The pre-SQL commands to run a query before you read data from Amazon Redshift. You can also use the UNLOAD or COPY command. The command you specify here is processed as plain text.
Post-SQL
The post-SQL commands to run a query after you write data to Amazon Redshift. You can also use the UNLOAD or COPY command. The command you specify here is processed as plain text.
Select Distinct
Selects unique values.
The Secure Agent includes a SELECT DISTINCT statement if you choose this option. Amazon Redshift ignores trailing spaces. Therefore, the Secure Agent might extract fewer rows than expected.
If you select the source type as query or use the SQL Query property and select the Select Distinct option, the Secure Agent ignores the Select Distinct option.
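The row-count effect of ignoring trailing spaces can be modeled with a short sketch. The function name distinct_ignoring_trailing_spaces and the sample values are hypothetical, and the code only imitates the comparison behavior described above; it does not query Amazon Redshift.

```python
def distinct_ignoring_trailing_spaces(values):
    """Return unique values in first-seen order, treating 'abc' and 'abc  ' as equal."""
    seen = set()
    result = []
    for value in values:
        key = value.rstrip(" ")  # drop trailing spaces before comparing
        if key not in seen:
            seen.add(key)
            result.append(key)
    return result


rows = ["abc", "abc  ", "abd"]
print(len(set(rows)))                               # exact comparison: 3 distinct values
print(len(distinct_ignoring_trailing_spaces(rows)))  # trailing spaces ignored: 2
```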
SQL Query
Overrides the default SQL query.
Enclose column names in double quotes. The SQL query is case sensitive. Specify an SQL statement supported by the Amazon Redshift database.
When you specify the columns in the SQL query, ensure that the column name in the query matches the source column name in the mapping.
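A minimal sketch of building an override query with double-quoted column names follows. The helper names quote_identifier and build_select, and the schema, table, and column names, are hypothetical. Doubling an embedded double quote is the standard SQL way to escape it inside a delimited identifier.

```python
def quote_identifier(name):
    """Wrap an identifier in double quotes, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_select(schema, table, columns):
    """Build a SELECT statement whose identifiers are all double-quoted."""
    column_list = ", ".join(quote_identifier(column) for column in columns)
    return f"SELECT {column_list} FROM {quote_identifier(schema)}.{quote_identifier(table)}"


# Hypothetical schema, table, and column names.
print(build_select("public", "sales", ["id", "Order Date"]))
```

Because quoted identifiers are case sensitive, pass each column name exactly as it appears in the mapping's source fields.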
Temporary Credential Duration
The time duration during which an IAM user can use the dynamically generated temporary credentials to access the AWS resource. Enter the time duration in seconds.
Default is 900 seconds.
If you require more than 900 seconds, you can set the time duration up to a maximum of 12 hours in the AWS console and then enter the same time duration in this property.
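The range described above can be checked before a value is entered in the property. In this sketch the function name validate_credential_duration is hypothetical, and treating 900 seconds as the lower bound is an assumption based on the default; the upper bound is the 12 hours stated above.

```python
DEFAULT_DURATION_SECONDS = 900      # default stated above
MAX_DURATION_SECONDS = 12 * 60 * 60  # 12 hours = 43200 seconds


def validate_credential_duration(seconds):
    """Return seconds unchanged if it lies between the default and 12 hours."""
    if not DEFAULT_DURATION_SECONDS <= seconds <= MAX_DURATION_SECONDS:
        raise ValueError(
            f"duration must be between {DEFAULT_DURATION_SECONDS} "
            f"and {MAX_DURATION_SECONDS} seconds, got {seconds}"
        )
    return seconds


print(validate_credential_duration(2 * 60 * 60))  # two hours, expressed in seconds
```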
Tracing Level
Amount of detail that appears in the log for the Source transformation. Use the verbose tracing level to get detailed log information.
(1) Does not apply to mappings in advanced mode.
(*) Does not apply to direct read mode.