Database Character Set Conversion

To accurately replicate character data, verify character set settings for the source and target databases.
Data Replication can use the International Components for Unicode (ICU) library to convert character data from the source database encoding to the target database encoding. Data Replication supports character set conversion for configurations that have a DB2 for Linux, UNIX, and Windows, Microsoft SQL Server, MySQL, or Oracle source and an Amazon Redshift, Greenplum, Netezza, Vertica, or Teradata target. When you create a replication configuration, the Data Replication Console queries the source and target databases to determine the source and target character sets and then writes these character set names to the replication configuration.
If the NLS_LANG environment variable is defined on the Oracle source, InitialSync uses this variable to determine the source database character set. If the NLS_LANG environment variable is not defined, InitialSync uses the character set from the replication configuration. The Applier always uses the source and target character sets from the replication configuration.
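The precedence rule above can be sketched in a few lines of Python. This is only an illustration, not Data Replication code: the function names are hypothetical, and falling back to the configuration when NLS_LANG has no character set part is an assumption that the text does not confirm. The sketch relies on the Oracle convention that NLS_LANG has the form LANGUAGE_TERRITORY.CHARSET.

```python
from typing import Mapping, Optional


def charset_from_nls_lang(nls_lang: str) -> Optional[str]:
    """Return the character set part of an NLS_LANG value, or None if absent.

    Oracle's NLS_LANG has the form LANGUAGE_TERRITORY.CHARSET, so the
    character set is whatever follows the first dot.
    """
    _, dot, charset = nls_lang.partition(".")
    charset = charset.strip()
    return charset if dot and charset else None


def initialsync_source_charset(env: Mapping[str, str], config_charset: str) -> str:
    """Hypothetical helper: NLS_LANG wins if defined, else the configuration."""
    nls_lang = env.get("NLS_LANG")
    if nls_lang:
        charset = charset_from_nls_lang(nls_lang)
        if charset:
            return charset
    # Assumption: an NLS_LANG value without a charset part is treated as undefined.
    return config_charset


def applier_source_charset(env: Mapping[str, str], config_charset: str) -> str:
    """Hypothetical helper: the Applier always uses the configuration value."""
    return config_charset
```

With NLS_LANG set to AMERICAN_AMERICA.WE8ISO8859P15 and a configuration value of AL32UTF8, the InitialSync helper returns WE8ISO8859P15 while the Applier helper returns AL32UTF8, which shows how the two components can disagree if the variable and the configuration are not aligned.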
Virtual columns do not have the character set property. Instead, a virtual column uses the character set of the mapped source table to which you add the column. If the character set is not defined for the source table, the virtual column uses the character set that is defined for the source schema or database.
  • Data Replication does not support character set conversion for configurations that include Tcl scripts.
  • Data Replication does not support character set conversion for constants that are used in SQL expressions.
  • Data Replication does not support character set conversion for non-Latin characters in source database object names.
If the source character data includes only Latin characters but the source and target databases use incompatible character sets, Informatica recommends that you disable character set conversion to avoid performance degradation. To disable character set conversion, set the global.icu_enabled runtime parameter to 0. For example, disable character set conversion if the source character set is UTF-8 and the target character set is Latin 9.
If the source character data includes non-Latin characters and the source and target databases use incompatible character sets, Data Replication ends with an error regardless of the global.icu_enabled setting.
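The UTF-8 and Latin 9 case can be explored with a small Python check. This sketch uses Python's built-in codecs as a stand-in for ICU, so it says nothing about how Data Replication itself behaves; the helper names are hypothetical. It shows two facts about the encodings: text limited to the ASCII subset has identical bytes in UTF-8 and ISO-8859-15 (Latin 9), and text in other scripts cannot be represented in Latin 9 at all.

```python
def same_bytes_in_both(text: str, source_encoding: str, target_encoding: str) -> bool:
    """True if the text encodes to identical bytes in both encodings,
    that is, if converting between them would change nothing."""
    try:
        return text.encode(source_encoding) == text.encode(target_encoding)
    except UnicodeEncodeError:
        # The text cannot be represented in at least one of the encodings.
        return False


def representable(text: str, encoding: str) -> bool:
    """True if every character of the text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# ASCII-only data: byte-for-byte identical in UTF-8 and Latin 9.
print(same_bytes_in_both("Order 42 shipped", "utf-8", "iso8859_15"))
# Accented Latin data: representable in Latin 9, but the bytes differ from UTF-8.
print(representable("Café", "iso8859_15"), same_bytes_in_both("Café", "utf-8", "iso8859_15"))
# Cyrillic data: not representable in Latin 9.
print(representable("Привет", "iso8859_15"))
```

A check like this against a sample of source data may help decide whether disabling conversion is safe before changing global.icu_enabled.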
For configurations that have sources other than DB2 for Linux, UNIX, and Windows, Microsoft SQL Server, MySQL, or Oracle and that have targets other than Amazon Redshift, Greenplum, Netezza, Vertica, or Teradata, the Extractor can convert source character data only from UTF-16 to UTF-8 encoding. For other source character encodings, the Extractor writes extracted change data to intermediate files in the original source character set. The Applier does not convert the character set of the change data when applying the data to the target. In this case, Data Replication requires the source and target databases to use the same character set.
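The source and target combinations described in this topic can be summarized as a lookup. The following Python sketch is a reading aid under stated assumptions, not product logic: the function name and the return labels are hypothetical, and combinations that the text does not describe (a listed source with an unlisted target, or the reverse) are reported as unspecified rather than guessed.

```python
ICU_SOURCES = frozenset({
    "DB2 for Linux, UNIX, and Windows",
    "Microsoft SQL Server",
    "MySQL",
    "Oracle",
})
ICU_TARGETS = frozenset({
    "Amazon Redshift",
    "Greenplum",
    "Netezza",
    "Vertica",
    "Teradata",
})


def conversion_path(source_type: str, target_type: str, source_encoding: str) -> str:
    """Classify a configuration by the conversion behavior the text describes."""
    if source_type in ICU_SOURCES and target_type in ICU_TARGETS:
        # Full character set conversion through the ICU library.
        return "icu"
    if source_type not in ICU_SOURCES and target_type not in ICU_TARGETS:
        if source_encoding.upper() == "UTF-16":
            # The Extractor converts UTF-16 source data to UTF-8.
            return "utf16-to-utf8"
        # No conversion: source and target must use the same character set.
        return "none"
    # Mixed combinations are not covered by the text above.
    return "unspecified"
```

For example, `conversion_path("Oracle", "Teradata", "UTF-8")` yields `"icu"`, while an Oracle source with an Oracle target falls into the unspecified branch of this sketch.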
For configurations with Oracle sources and Oracle targets, you can configure the Oracle target databases to convert the change data that the Applier and InitialSync load to the target character set. Define the NLS_LANG parameter on the systems where the Applier and InitialSync run. To accurately replicate data, set this parameter value to match the source database character set. In the Data Replication Console, create an environment variables list and add the NLS_LANG variable to it. Then assign this environment variables list to the Server Manager that runs the Applier or InitialSync. Oracle performs the conversion if the NLS_LANG environment variable value does not match the NLS_CHARACTERSET setting of the target database.
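As a rough planning aid, the following Python sketch builds an NLS_LANG value from a source character set and predicts whether Oracle would convert the loaded data. The function names are hypothetical, the AMERICAN_AMERICA language and territory are placeholder defaults, and the comparison is a simplified, case-insensitive match on the character set name. The actual variable is still defined through an environment variables list in the Data Replication Console.

```python
def build_nls_lang(source_charset: str,
                   language: str = "AMERICAN",
                   territory: str = "AMERICA") -> str:
    """Compose an NLS_LANG value in the LANGUAGE_TERRITORY.CHARSET form.

    The language and territory defaults are placeholders for illustration.
    """
    return f"{language}_{territory}.{source_charset}"


def oracle_will_convert(nls_lang: str, target_nls_characterset: str) -> bool:
    """Predict whether Oracle converts the data: it does so when the character
    set in NLS_LANG differs from the target database NLS_CHARACTERSET."""
    _, dot, client_charset = nls_lang.partition(".")
    if not dot or not client_charset:
        raise ValueError("NLS_LANG has no character set part: " + repr(nls_lang))
    return client_charset.upper() != target_nls_characterset.upper()


# Source database uses WE8ISO8859P15, target database uses AL32UTF8.
nls_lang = build_nls_lang("WE8ISO8859P15")
print(nls_lang)
print(oracle_will_convert(nls_lang, "AL32UTF8"))
```

If the function reports that no conversion would occur, the source and target character sets already match and Oracle loads the data unchanged.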
