On-Premises Developer Guide


Transliteration

Transliteration converts data between non-Latin writing systems and Latin characters. Transliteration can also replace diacritical and extended characters with plain ASCII equivalents.

Transliteration helps non-native speakers find an approximate pronunciation of a word based on the pronunciation rules of their own language. Transliterations that follow ISO standards use an invertible mapping, so the transliteration can be reversed without any loss of information. Transliterations that follow other standards, such as BGN, are not reversible.
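The difference between a reversible and a non-reversible scheme can be sketched in a few lines of Python. The tables below are small illustrative subsets of the ISO 9:1995 and BGN/PCGN 1947 mappings for Russian, and the function names are hypothetical. This is not the Address Verification API, only a demonstration of why a one-to-one mapping can be inverted while a multi-letter mapping can collide.

```python
import unicodedata

# Illustrative subsets only (assumption: not the product's internal tables).
# ISO 9:1995 maps each Cyrillic letter to exactly one Latin character.
ISO9 = {"ц": "c", "т": "t", "с": "s", "ш": "š", "ч": "č", "щ": "ŝ"}
# BGN/PCGN 1947 uses letter combinations, so different inputs can collide.
BGN = {"ц": "ts", "т": "t", "с": "s", "ш": "sh", "ч": "ch", "щ": "shch"}


def transliterate(text, table):
    """Replace each character using the table; pass unknown characters through."""
    return "".join(table.get(ch, ch) for ch in text)


def reverse_iso9(text):
    """Invert the ISO 9 subset. This works because the mapping is one-to-one."""
    inverse = {latin: cyrillic for cyrillic, latin in ISO9.items()}
    # Normalize so accented letters are single code points before the lookup.
    text = unicodedata.normalize("NFC", text)
    return "".join(inverse.get(ch, ch) for ch in text)


# Two different Cyrillic inputs produce the same BGN output,
# so the original spelling cannot be recovered from the result.
bgn_single = transliterate("щ", BGN)
bgn_pair = transliterate("шч", BGN)

# The ISO 9 outputs stay distinct and can be reversed.
iso_single = transliterate("щ", ISO9)
iso_pair = transliterate("шч", ISO9)
```

In this sketch, `bgn_single` and `bgn_pair` are the same string, while `reverse_iso9(iso_single)` and `reverse_iso9(iso_pair)` return the two original inputs.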

Informatica Address Verification can transliterate to and from the following writing systems:

Chinese Pinyin (Mandarin, Cantonese)

Cyrillic (BGN/PCGN 1947, ISO 9:1995)

You can perform Cyrillic transliteration for Belarus, Bulgaria, Kazakhstan, Macedonia, Russia, and Ukraine.

Greek (BGN/PCGN 1962, ISO 843:1997)

Hebrew

Japanese Katakana, Hiragana, and Kanji

Transliteration goes beyond character set mapping, which is a mapping between different numeric representations of a character. A language such as Japanese, with Katakana, Hiragana, and Kanji characters, has sounds with no direct representation in the English language. However, each Japanese character has an associated sound that can be approximated phonetically in Latin characters.
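The distinction can be illustrated with a short Python sketch. Character set mapping changes only the numeric representation of a character: the same Kanji has different byte values in UTF-8 and Shift_JIS, yet decodes to the identical character. Transliteration produces a different, phonetic string. The `READINGS` table and the `romanize` function are hypothetical stand-ins with a single entry and do not reflect how Address Verification performs the conversion.

```python
KANJI = "市"

# Character set mapping: one character, two numeric representations.
utf8_bytes = KANJI.encode("utf-8")
sjis_bytes = KANJI.encode("shift_jis")

# Decoding either byte sequence yields the same character again.
same_character = utf8_bytes.decode("utf-8") == sjis_bytes.decode("shift_jis")

# Transliteration: a phonetic approximation in Latin characters.
# Assumption: a one-entry lookup for illustration only.
READINGS = {"市": "shi"}


def romanize(text):
    """Replace each known character with its Latin reading."""
    return "".join(READINGS.get(ch, ch) for ch in text)


latin = romanize(KANJI)
```

Real Kanji often have several readings that depend on context, so a single lookup table like this one is a simplification.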

The following table shows transliteration of sample characters from different character sets:

Source Character Set	Input	Destination Character Set	Output
Latin	Ä	ASCII	AE
Latin	ĝ	ASCII	g
Kanji (Japanese)	市	Latin	shi
Cyrillic	Ж	Latin	ZH
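The rows in the table can be reproduced with a small Python sketch that combines an explicit lookup with Unicode decomposition. The `SPECIAL` table and the `transliterate_sample` function are hypothetical names, and the lookup contains only the entries from the table. Stripping combining marks handles a character such as ĝ, but characters such as Ä need an explicit rule because the expected output is AE and not A.

```python
import unicodedata

# Assumption: explicit rules taken from the sample table only.
SPECIAL = {"Ä": "AE", "Ж": "ZH", "市": "shi"}


def transliterate_sample(text):
    """Apply explicit rules first, then strip diacritics from what remains."""
    out = []
    for ch in unicodedata.normalize("NFC", text):
        if ch in SPECIAL:
            out.append(SPECIAL[ch])
            continue
        # Split the character into a base letter plus combining marks,
        # then keep only the base letter.
        decomposed = unicodedata.normalize("NFD", ch)
        out.append("".join(c for c in decomposed if not unicodedata.combining(c)))
    return "".join(out)


results = {ch: transliterate_sample(ch) for ch in ["Ä", "ĝ", "市", "Ж"]}
```

Characters that have no explicit rule and no decomposition pass through unchanged, so this sketch does not guarantee ASCII output for arbitrary input.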
