Amazon Redshift Connector Best Practices

Rules and Guidelines for Connecting to Amazon Redshift

Consider the following rules and guidelines when you connect to Amazon Redshift using Informatica Cloud, Big Data Management, and PowerCenter:
  • Amazon Redshift Connector, PowerExchange for Amazon Redshift, and PowerExchange for Amazon Redshift for PowerCenter do not enforce primary key constraints.
  • Do not use Amazon Redshift lookups that are not cached.
    If a lookup is not cached, a separate lookup is run against the Amazon Redshift table for each record that is processed. If a lookup is cached, the lookup records are loaded into memory all at once, which avoids the record-level queries.
  • Do not use update overrides that are issued for each record. Updating an Amazon Redshift table is more efficient when the data is batched together, and when you run a mapping to update an Amazon Redshift table, the table is updated in batches. Instead of overriding the update, try to express the logic in the mapping itself or split the mapping.
  • Do not use functions and expressions that are not supported for pushdown optimization.
  • Specify the distribution and sort keys on all tables.
  • Ensure that the Amazon Redshift columns do not contain null values.
  • Distribution and sort keys are more effective in Amazon Redshift than traditional primary key constraints. In many cases, the existing primary keys of a table are a good choice for the sort keys. However, refer to the Amazon Redshift documentation to ensure that the keys are configured in the way best suited to your analysis.
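
The guidelines on distribution keys, sort keys, and null values can be illustrated with a minimal sketch that generates a Redshift CREATE TABLE statement. This is not part of any Informatica product. The helper name build_create_table, the sales table, and its columns are hypothetical, and marking every column NOT NULL is only one possible way to follow the null-value guideline. The generated statement would be run against Amazon Redshift directly, before the mapping writes to the table.

```python
from typing import Sequence, Tuple


def build_create_table(
    table: str,
    columns: Sequence[Tuple[str, str]],
    dist_key: str,
    sort_keys: Sequence[str],
) -> str:
    """Return a Redshift CREATE TABLE statement with explicit keys.

    Hypothetical helper for illustration only. Every column is declared
    NOT NULL (an assumption based on the null-value guideline), and the
    table gets one distribution key and a compound sort key.
    """
    names = [name for name, _ in columns]
    # Fail early if a key refers to a column that is not in the table.
    for key in (dist_key, *sort_keys):
        if key not in names:
            raise ValueError(f"unknown key column: {key}")

    column_lines = ",\n".join(
        f"    {name} {sql_type} NOT NULL" for name, sql_type in columns
    )
    return (
        f"CREATE TABLE {table} (\n{column_lines}\n)\n"
        "DISTSTYLE KEY\n"
        f"DISTKEY ({dist_key})\n"
        f"COMPOUND SORTKEY ({', '.join(sort_keys)});"
    )


# Hypothetical table: distribute on the join column and sort on the
# date plus the former primary key column.
ddl = build_create_table(
    "sales",
    [("sale_id", "BIGINT"), ("customer_id", "BIGINT"), ("sale_date", "DATE")],
    dist_key="customer_id",
    sort_keys=["sale_date", "sale_id"],
)
print(ddl)
```

Which column works best as the distribution key or sort key depends on the join and filter patterns of the queries, so treat the choices above as placeholders and confirm them against the Amazon Redshift documentation.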
