Character Encoding

A character encoding maps the characters of a language or group of languages to numeric codes, which are usually written in hexadecimal.
When you design a Script, you define the encoding of the input documents and the encoding of the output documents. You also define the working encoding, which determines how the IntelliScript editor displays characters and how the Data Processor transformation processes them.
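
The relationship between characters, code values, and input and output encodings can be illustrated outside the product. The following Python sketch is not part of the Data Processor transformation; the `transcode` helper is a hypothetical name used only for illustration. It shows that the same character has different byte values in different encodings, and that converting a document means decoding the input bytes and re-encoding them.

```python
def transcode(data: bytes, input_encoding: str, output_encoding: str) -> bytes:
    """Decode bytes with the input encoding, then re-encode with the output encoding."""
    text = data.decode(input_encoding)   # in-memory text, independent of any byte layout
    return text.encode(output_encoding)


char = "é"
code_point = f"U+{ord(char):04X}"        # the Unicode code point, written in hexadecimal

latin1_bytes = char.encode("latin-1")    # one byte in ISO-8859-1
utf8_bytes = transcode(latin1_bytes, "latin-1", "utf-8")  # two bytes in UTF-8

print(code_point)          # U+00E9
print(latin1_bytes.hex())  # e9
print(utf8_bytes.hex())    # c3a9
```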

Working Encoding

The working encoding is the code page for the data in memory and the code page for the data that appears in the user interface and work files. You must select a working encoding that is compatible with the encoding of the schemas that you reference in the Data Processor transformation.
The following table shows the working encoding settings:
Setting
Description
Use the Data Processor Default Code Page
Uses the default encoding from the Data Processor transformation.
Other
Select the encoding from the list.
XML Special Characters Encoding
Determines the representation of XML special characters. You can select None or XML.
  • None. Leave as &amp; &lt; &gt; &quot; &apos;
    Entity references for XML special characters are interpreted as text. For example, the character > appears as &gt;
    Default.
  • XML. Convert to & < > " '
    Entity references for XML special characters are interpreted as regular characters. For example, &gt; appears as the following character: >
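
The difference between the two XML special character options can be sketched in Python. This is an illustration of the concept only, not the product's implementation; the function name `apply_special_characters_setting` and the setting strings are assumptions made for the example. It uses `xml.sax.saxutils.unescape` from the Python standard library.

```python
from xml.sax.saxutils import unescape

# unescape() converts &amp; &lt; &gt; by default; the quote entities are passed explicitly.
EXTRA_ENTITIES = {"&quot;": '"', "&apos;": "'"}


def apply_special_characters_setting(text: str, setting: str) -> str:
    """Hypothetical helper that mimics the None and XML options."""
    if setting == "None":
        # Leave entity references as literal text.
        return text
    if setting == "XML":
        # Convert entity references to the characters they represent.
        return unescape(text, EXTRA_ENTITIES)
    raise ValueError(f"unknown setting: {setting}")


sample = "price &gt; 10 &amp; name = &quot;x&quot;"
as_text = apply_special_characters_setting(sample, "None")
as_chars = apply_special_characters_setting(sample, "XML")

print(as_text)   # price &gt; 10 &amp; name = &quot;x&quot;
print(as_chars)  # price > 10 & name = "x"
```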

Input Encoding

The input encoding determines how character data is encoded in input documents. You can configure the encoding for additional input ports in a Script.
The following table describes the encoding settings in the Input area:
Setting
Description
Use Encoding Specified in Input Document
Use the code page that the source document defines, such as the encoding attribute of an XML document.
If the source document does not have an encoding specification, the Data Processor transformation uses the encoding settings from the Settings view.
Use Working Encoding
Use the same encoding as the working encoding.
Other
Select the input encoding from a drop-down list.
XML Special Characters Encoding
Determines the representation of XML special characters. You can select None or XML.
  • None. Leave as &amp; &lt; &gt; &quot; &apos;
    Entity references for XML special characters are interpreted as text. For example, the character > appears as &gt;
    Default.
  • XML. Convert to & < > " '
    Entity references for XML special characters are interpreted as regular characters. For example, &gt; appears as the following character: >
Byte Order
Describes how multi-byte characters appear in the input document. You can select the following options:
  • Little-endian. The least significant byte appears first. Default.
  • Big-endian. The most significant byte appears first.
  • No binary conversion.
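
The fallback behavior of the Use Encoding Specified in Input Document setting can be sketched as follows. This Python example is an assumption-laden illustration, not the product's logic: the `pick_input_encoding` function and its `fallback` parameter are hypothetical, and the simple pattern only handles XML declarations in ASCII-compatible encodings.

```python
import re

# Matches the encoding attribute of an XML declaration at the start of the document.
_XML_DECLARATION = re.compile(
    rb"""^<\?xml[^>]*?encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)


def pick_input_encoding(document: bytes, fallback: str) -> str:
    """Return the encoding the document declares, or the fallback if none is declared."""
    match = _XML_DECLARATION.match(document)
    if match:
        return match.group(1).decode("ascii")
    return fallback  # stands in for the encoding configured elsewhere


declared = pick_input_encoding(b'<?xml version="1.0" encoding="ISO-8859-1"?><a/>', "utf-8")
undeclared = pick_input_encoding(b"<a/>", "utf-8")

print(declared)    # ISO-8859-1
print(undeclared)  # utf-8
```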

Output Encoding

The output encoding determines how character data is encoded in the main output document.
The following table describes the encoding settings in the Output area:
Setting
Description
Use Working Encoding
The output encoding is the same as the working encoding.
Other
Select the output encoding from the list.
XML Special Characters Encoding
Determines the representation of XML special characters. You can select None or XML.
  • None. Leave as &amp; &lt; &gt; &quot; &apos;
    Entity references for XML special characters are interpreted as text. For example, the character > appears as &gt;
    Default.
  • XML. Convert to & < > " '
    Entity references for XML special characters are interpreted as regular characters. For example, &gt; appears as the following character: >
Same as Input Encoding
The output encoding is the same as the input encoding.
Byte Order
Describes how multi-byte characters appear in the output document. You can select the following options:
  • Little-endian. The least significant byte appears first. Default.
  • Big-endian. The most significant byte appears first.
  • No binary conversion.
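
Byte order matters for encodings whose code units span more than one byte, such as UTF-16. The following Python sketch, which is independent of the product, shows the same character written in little-endian and big-endian order, and what happens when bytes are read with the wrong byte order.

```python
char = "A"  # U+0041

little_endian = char.encode("utf-16-le")  # least significant byte first
big_endian = char.encode("utf-16-be")     # most significant byte first

print(little_endian.hex())  # 4100
print(big_endian.hex())     # 0041

# Reading little-endian bytes as big-endian yields a different character (U+4100).
misread = little_endian.decode("utf-16-be")
print(f"U+{ord(misread):04X}")  # U+4100
```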


Updated June 14, 2019