User Guide

Using Alternatives to Select a Secondary Parser

You can use an Alternatives anchor to control which of several secondary parsers processes a document. The main Parser can use this feature to process source documents of multiple types.
For example, suppose that the home page of a newspaper web site has links to articles. Each link is followed by a label that identifies the article as News, Business, or Sports, and you want to parse each article with a different Parser for its type. The home page contains text like this:
<a href="PrincessWeds.html">Norwegian Princess Weds</a> - News
<a href="BanksMerge.html">Local Banks to Merge</a> - Business
<a href="HomeTeamWins.html">Bears Trounce Antelopes</a> - Sports
You can support this situation in the following way:
  1. The main Parser retrieves the filename of an article and stores it in a variable.
  2. The main Parser contains an Alternatives anchor that is configured with the DocumentOrder option.
  3. The Alternatives anchor contains nested Group anchors.
  4. Each Group anchor is configured with a Marker anchor and a RunParser action, as follows:
    • The first Group contains a Marker that searches for the string News. The Group is configured with a RunParser action that runs a secondary Parser called NewsParser.
    • The second Group contains a Marker that searches for Business and a RunParser action that runs BusinessParser.
    • The third Group contains a Marker that searches for Sports and a RunParser action that runs SportsParser.
The Alternatives anchor tests all three Group anchors. It accepts the Group containing the first Marker that occurs after the filename. The Group runs the appropriate Parser on the file.
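The selection logic can be illustrated outside Data Transformation with a short Python sketch. This is not IntelliScript and does not use any Data Transformation API; it only simulates the behavior described above, assuming that DocumentOrder means "accept the Group whose Marker occurs earliest after the current position." The function names (choose_group, parse_home_page) and the three stand-in parser functions are hypothetical.

```python
import re

# Hypothetical stand-ins for the secondary Parsers named in the text.
def news_parser(filename):
    return f"NewsParser processed {filename}"

def business_parser(filename):
    return f"BusinessParser processed {filename}"

def sports_parser(filename):
    return f"SportsParser processed {filename}"

# Each pair plays the role of one Group: a Marker string and the Parser to run.
GROUPS = [
    ("News", news_parser),
    ("Business", business_parser),
    ("Sports", sports_parser),
]

HOME_PAGE = (
    '<a href="PrincessWeds.html">Norwegian Princess Weds</a> - News\n'
    '<a href="BanksMerge.html">Local Banks to Merge</a> - Business\n'
    '<a href="HomeTeamWins.html">Bears Trounce Antelopes</a> - Sports\n'
)

def choose_group(text, start):
    """Return the parser whose marker occurs first at or after `start`.

    Simulates the assumed DocumentOrder behavior: every Group is tested,
    and the one whose Marker appears earliest in the document is accepted.
    Returns None if no marker is found.
    """
    best = None
    for marker, parser in GROUPS:
        pos = text.find(marker, start)
        if pos != -1 and (best is None or pos < best[0]):
            best = (pos, parser)
    return best[1] if best else None

def parse_home_page(text):
    """Extract each linked filename, then run the parser chosen for it."""
    results = []
    for match in re.finditer(r'href="([^"]+)"', text):
        filename = match.group(1)          # step 1: store the filename
        parser = choose_group(text, match.end())
        if parser is not None:
            results.append(parser(filename))
    return results
```

Calling parse_home_page(HOME_PAGE) pairs each filename with the label that follows it. A plain substring search like this would misfire if a headline itself contained one of the label words, which is one reason to keep the Marker search text as specific as the source documents allow.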
