Table of Contents

Search

  1. Preface
  2. Introduction to PowerExchange for Amazon Redshift
  3. PowerExchange for Amazon Redshift Configuration
  4. Amazon Redshift Sources and Targets
  5. Amazon Redshift Mappings
  6. Amazon Redshift Pushdown Optimization
  7. Amazon Redshift Sessions
  8. Appendix A: Amazon Redshift Data Type Reference
  9. Appendix B: Troubleshooting

PowerExchange for Amazon Redshift User Guide for PowerCenter

PowerExchange for Amazon Redshift User Guide for PowerCenter

Amazon Redshift Target Session Configuration

Amazon Redshift Target Session Configuration

You can configure a session to write data to Amazon Redshift. Define the properties for each target instance in the session.
The following table describes the target session properties:
Property
Description
S3 Bucket Name
Amazon S3 bucket name for the Amazon Redshift target data.
Use an S3 bucket in the same region as your Amazon Redshift cluster.
Enable Compression
Compresses staged files before writing the files to Amazon Redshift.
Session performance improves when the PowerCenter Integration Service compresses the staged files.
Default is selected.
Staging Directory Location
Amazon Redshift staging directory.
Specify a directory on the machine that hosts the PowerCenter Integration Service.
Batch Size
Minimum number of rows in a batch. Enter a number greater than 0.
Default is 1000.
Configure the commit interval in the
Session
Properties
when you configure the batch size. Specify the same value for commit interval that you specified for the batch size.
Max Redshift Errors per Upload Batch for INSERT
Number of errors within a batch that causes a batch to fail. Enter a positive integer.
If the number of errors is equal to or greater than the property value, the PowerCenter Integration Service writes the entire batch to the error file.
Default is 1.
Truncate Target Table Before Data Load
Truncates an Amazon Redshift target table before writing data to the target.
Null value for CHAR and VARCHAR data types
String value used to represent null values in CHAR and VARCHAR fields in Amazon Redshift targets, such as NULL or a space character.
Default is an empty string.
Wait time in seconds for file consistency on S3
Number of seconds to wait for the PowerCenter Integration Service to make the staged files consistent with the list of files available on Amazon S3.
Default is 5.
CopyOptions Property File
Name of the property file.
Enables you to add additional options to the copy command for writing data from an Amazon S3 source to an Amazon Redshift target when the default delimiter comma (,) or double-quote (") is used in the data.
You can add the following options:
  • DELIMITER
  • ACCEPTINVCHARS
  • QUOTE
  • COMPUPDATE
  • AWS_IAM_ROLE
Either specify the path of the property file that contains the copy options or specify the copy options directly in the
CopyOptions Property File
field.
Specify a directory on the machine that hosts the PowerCenter Integration Service.
Turn on S3 Server Side Encryption
Indicates that Amazon S3 encrypts data during upload and decrypts data at the time of access.
Turn on S3 Client Side Encryption
Indicates that the PowerCenter Integration Service encrypts data by using a private encryption key.
If you enable both server side and client side encryption, the PowerCenter Integration Service ignores the server side encryption.
Vacuum Target Table
Recovers disk space and sorts rows in a specified table or all tables in the database.
You can select the following recovery options:
None
Does not sort rows or recover disk space.
Full
Sorts the specified table or all tables in the database and recovers disk space occupied by rows marked for deletion by previous update and delete operations.
Sort Only
Sorts the specified table or all tables in the database without recovering space freed by deleted rows.
Delete Only
Recovers disk space occupied by rows marked for deletion by previous update and delete operations, and compresses the table to free up used space.
Reindex
Analyzes the distribution of the values in the interleaved sort key columns to configure the entire Vacuum table operations for a better performance.
Default is None.
Analyze Target Table
Improve the efficiency of the write operations.
The query planner on Amazon Redshift updates the statistical metadata to build and choose optimal plans to improve the efficiency of queries.
Prefix for Retaining Staging Files on S3
Retains staging files on Amazon S3.
Provide both a directory prefix and a file prefix separated by a slash (/) or only a file prefix to retain staging files on Amazon S3. For example,
backup_dir/backup_file
or
backup_file
.
INSERT
If enabled, the PowerCenter Integration Service inserts all rows flagged for insert. If disabled, the PowerCenter Integration Service rejects the rows flagged for insert.
By default, the insert operation is enabled.
DELETE
If enabled, the PowerCenter Integration Service deletes all rows flagged for delete. If disabled, the PowerCenter Integration Service rejects all rows flagged for delete.
To perform a delete operation, you must map the primary key column and at least one column other than primary key column.
UPDATE
Performs update and upsert operations. To perform an update operation, you must map the primary key column and at least one column other than primary key column.
You can select the following session target attributes:
Update as Update
The PowerCenter Integration Service updates all rows as updates.
Update as Insert
The PowerCenter Integration Service updates existing rows and inserts other rows as if marked for insert.
In the session properties, if you set
Update
as the value of the
Treat source as rows
property and select the
Update as Insert
option in the target session properties, the mapping runs successfully. However, the PowerCenter Integration Service rejects the data.
Success File Directory
Directory for the Amazon Redshift success file.
Specify a directory on the machine that hosts the PowerCenter Integration Service.
Error File Directory
Directory for the Amazon Redshift error file.
Specify a directory on the machine that hosts the PowerCenter Integration Service.
Pre-SQL
The pre-SQL commands to run a query before you read data from Amazon Redshift. You can also use the UNLOAD or COPY command. The command you specify here is processed as a plain text.
Post-SQL
The post-SQL commands to run a query after you write data to Amazon Redshift. You can also use the UNLOAD or COPY command. The command you specify here is processed as a plain text.
Target Table Name
You can override the default target table name.
Part Size
Specifies the part size of an object.
Default is 5 MB.
TransferManager Thread Pool Size
Specifies the number of the threads to write data in parallel.
Default is 10.
Number of Files per Batch
Calculates the number of the staging files per batch.
If you do not provide the number of files, PowerExchange for Amazon Redshift calculates the number of the staging files.

0 COMMENTS

We’d like to hear from you!