Delimiters Component Reference

A delimiters component defines a hierarchy of characters or strings that organize the information in a document, such as newlines, spaces, tabs, commas, or vertical bars. You can also use a wildcard pattern to define the delimiters.
The delimiter concept applies both to rigidly structured documents that use predefined delimiter characters to separate the data fields, and to loosely structured text or HTML documents that are delimited by newlines and syntactic markup. It also encompasses positionally structured data, where the fields are located at fixed offsets from one another.
The Parser uses the delimiters to determine the search criteria of Content anchors configured with the LearnByExample option.
For example, suppose you configure a format with the TabDelimited delimiters component. This defines a hierarchy using the following characters as delimiters, from the highest level to the lowest:
Newline
Tab
You might define a Content anchor that is located two tab characters after the preceding Marker anchor in the example source, like this:
MARKER<tab>abc<tab>CONTENT
When a Parser processes a source document, it searches for the Content anchor two tabs after the Marker anchor.
In a second example, you might define a Content anchor that is located three newlines and one tab after a Marker anchor in the example source:
MARKER
abc<tab>de
fghi<tab>jkl<tab>mnop
pqrst<tab>CONTENT
Within the intermediate lines, the tabs are not counted because the newlines are higher in the hierarchy.
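The counting rule in the two examples above can be illustrated with a short Python sketch. It is not how the Parser is implemented and it is not part of the product. The function name find_content and its parameters are hypothetical, and the <tab> notation from the examples is written as a real tab character ("\t"). The sketch assumes that delimiters are skipped level by level, starting with the highest level of the hierarchy.

```python
def find_content(text, marker, hierarchy, counts):
    """Return the field reached by skipping counts[i] occurrences of
    hierarchy[i] after the marker, from the highest level to the lowest.

    Hypothetical helper for illustration only.
    """
    # Start searching immediately after the marker text.
    pos = text.index(marker) + len(marker)

    # Skip the requested number of delimiters at each level, highest first.
    # Lower-level delimiters passed while skipping a higher level are ignored.
    for delim, count in zip(hierarchy, counts):
        for _ in range(count):
            pos = text.index(delim, pos) + len(delim)

    # The field ends at the next delimiter of any level, or at end of text.
    ends = [i for i in (text.find(d, pos) for d in hierarchy) if i != -1]
    end = min(ends) if ends else len(text)
    return text[pos:end]


# Hierarchy assumed for TabDelimited: newline above tab.
TAB_DELIMITED = ["\n", "\t"]

# First example: zero newlines, then two tabs after the marker.
first = find_content("MARKER\tabc\tCONTENT", "MARKER", TAB_DELIMITED, [0, 2])

# Second example: three newlines, then one tab after the marker.
source = "MARKER\nabc\tde\nfghi\tjkl\tmnop\npqrst\tCONTENT"
second = find_content(source, "MARKER", TAB_DELIMITED, [3, 1])

print(first)   # CONTENT
print(second)  # CONTENT
```

In the second call, the tabs inside the intermediate lines are passed over while the three newlines are being skipped, so only the tab on the final line is counted.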
Many of the delimiters components, such as TabDelimited or CommaDelimited, display a predefined hierarchy of delimiters, which you can edit as required.
The DelimiterHierarchy component does not have a predefined hierarchy. You can insert whatever delimiters you need.