Developer Guide (On-Premises)

Character Set Mapping

Character set mapping defines how each character in a source character set corresponds to a character in a destination character set, which makes conversion between character sets possible. Informatica Address Verification uses Unicode internally and supports multiple external character sets, including UTF-8, ISO 8859-1, GBK, BIG5, JIS, and EBCDIC.
A character set assigns a numeric code to each character that it supports. Character sets typically assign the same code to common characters. However, some language-specific characters have different codes in different character sets.
For example, the letter A has the same numeric representation, 65, in both Unicode and the ISO 8859-1 (Latin-1) character set. However, the letter Å has different representations across character sets: Å is represented by 197 in Unicode and ISO 8859-1, but by 143 in the legacy IBM code page 437. Characters that have different numeric representations across character sets fail to appear correctly when data encoded in one character set is rendered in another.
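The numeric representations of a character can be checked with a short Python sketch. It uses only Python's built-in codecs. The codec names (`latin-1`, `cp437`, `utf-8`) and the helper `code_values` are illustrative assumptions and are not Address Verification identifiers or settings.

```python
def code_values(ch: str) -> dict:
    """Return the Unicode code point of ch and its byte values in a few
    character sets. The codec names are Python's, chosen for illustration."""
    values = {"unicode": ord(ch)}
    for encoding in ("latin-1", "cp437", "utf-8"):
        values[encoding] = list(ch.encode(encoding))
    return values


for ch in ("A", "Å"):
    print(ch, code_values(ch))

# Bytes written in one character set and read in another do not round-trip:
# the Latin-1 byte for "Å" decodes to a different character under cp437.
misread = "Å".encode("latin-1").decode("cp437")
print(misread == "Å")  # False
```

The letter A maps to 65 in every one of these character sets, whereas Å maps to 197 in Unicode and Latin-1, to 143 in code page 437, and to a two-byte sequence in UTF-8.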
To convert between character sets, Address Verification first converts the input character strings to Unicode. It then uses the mapping for the destination character set to render the data as faithfully as that character set allows. If the destination character set has no representation for a character, Address Verification maps that character to an underscore character (_).
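The two-step conversion described above can be approximated outside the product with a minimal Python sketch. This is not the Address Verification implementation. The function `convert` and the error-handler name `underscore` are hypothetical names introduced here, and the sketch assumes that Python's codecs stand in for the product's mapping tables.

```python
import codecs


def _underscore_handler(error):
    # Hypothetical handler: substitute "_" for each character that the
    # destination character set cannot represent, then resume encoding.
    if isinstance(error, UnicodeEncodeError):
        return ("_" * (error.end - error.start), error.end)
    raise error


codecs.register_error("underscore", _underscore_handler)


def convert(data: bytes, source: str, destination: str) -> bytes:
    """Convert bytes from the source character set to the destination one."""
    text = data.decode(source)  # step 1: source bytes -> Unicode
    # step 2: Unicode -> destination bytes, "_" for unmappable characters
    return text.encode(destination, errors="underscore")


# "Å" is byte 143 in cp437 and byte 197 in Latin-1.
print(convert(b"\x8f", "cp437", "latin-1"))

# "Ł" and "ź" have no Latin-1 representation, so each becomes "_".
print(convert("Łódź".encode("utf-8"), "utf-8", "latin-1"))
```

Routing every conversion through Unicode means that each character set needs only one mapping, to and from Unicode, rather than a separate mapping for every pair of character sets.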
Submit addresses to Address Verification in a single character set.