Data Transformation User Guide

XML Streamers

An XmlStreamer component splits a large XML document into smaller portions. The XmlStreamer divides the XML source into header, body, and footer segments. The body segments can contain repeating or non-repeating elements. The XmlStreamer can pass each XML segment to an appropriate transformation, typically a Mapper or a Serializer.
An XmlStreamer works in much the same way as a Streamer, with a few differences due to the structured XML input. The following are the main features:
  • The body segments are defined as XML elements. You can configure the body with multiple elements of the same or different types, in any sequence.
  • The header is defined as the entire portion of the XML that precedes the first body segment. In the XmlStreamer configuration, it is not necessary to define the elements that make up the header.
  • The footer is defined as the entire portion of the XML that follows the last body segment. In the XmlStreamer configuration, it is not necessary to define the elements that make up the footer.
  • In many cases, the header and footer segments are not well-formed XML. To enable passing the segments to a Mapper or Serializer, you can configure modifier components that convert the segments to well-formed XML.
To help understand these features, consider the following source XML structure:
<stream>
    <headerline1>MainHeader</headerline1>
    <substreams>
        <substream>
            <subheaderline1>SubHeader</subheaderline1>
            <segments>
                <segment1>Segment1A</segment1>
                <segment1>Segment1B</segment1>
                <segment2>Segment2A</segment2>
                <segment1>Segment1C</segment1>
                <segment2>Segment2B</segment2>
            </segments>
            <subfooterline1>SubFooter</subfooterline1>
        </substream>
        <substream>...</substream>
        <substream>...</substream>
    </substreams>
    <footerline1>MainFooter</footerline1>
</stream>
In this example, you might define the body segments as the substream elements. The header is everything that precedes the first substream:
<stream>
    <headerline1>MainHeader</headerline1>
    <substreams>
The footer is everything that follows the last substream:
    </substreams>
    <footerline1>MainFooter</footerline1>
</stream>
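The split described here can be sketched in Python. This is only an illustration of the header/body/footer idea, not how the XmlStreamer is implemented. The function name split_segments, the sample data, and the use of a regular expression are assumptions made for this example, and the approach handles only body elements that are not nested inside one another.

```python
import re

# Compact version of the sample structure, with two body segments.
SOURCE = (
    "<stream><headerline1>MainHeader</headerline1><substreams>"
    "<substream><segments><segment1>Segment1A</segment1></segments></substream>"
    "<substream><segments><segment2>Segment2A</segment2></segments></substream>"
    "</substreams><footerline1>MainFooter</footerline1></stream>"
)


def split_segments(xml_text, body_tag):
    """Return (header, body_segments, footer) for non-nested body elements."""
    tag = re.escape(body_tag)
    # \b keeps 'substream' from matching the enclosing 'substreams' element.
    pattern = re.compile(rf"<{tag}\b[^>]*>.*?</{tag}>", re.DOTALL)
    matches = list(pattern.finditer(xml_text))
    if not matches:
        return xml_text, [], ""
    header = xml_text[:matches[0].start()]   # everything before the first body segment
    footer = xml_text[matches[-1].end():]    # everything after the last body segment
    body = [m.group(0) for m in matches]
    return header, body, footer


header, body, footer = split_segments(SOURCE, "substream")
print(header)
print(len(body), "body segments")
print(footer)
```

Neither the header nor the footer returned by this sketch parses on its own, because each contains tags whose partners lie on the other side of the body.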
The header and footer segments are not well-formed XML. You can apply modifiers that add closing or opening tags to make them well-formed. For example, a modifier can convert the header to:
<stream>
    <headerline1>MainHeader</headerline1>
    <substreams>
    </substreams>
</stream>
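As a rough sketch of what such a modifier does, the following Python function appends the missing closing tags to a header fragment and prepends the missing opening tags to a footer fragment. The function name make_well_formed and the simple tag-matching regular expression are assumptions for illustration. They are not part of the product, and the sketch ignores comments, CDATA sections, and attributes on the restored tags.

```python
import re
import xml.etree.ElementTree as ET

# Matches opening, closing, and self-closing element tags.
TAG = re.compile(r"<(/?)([A-Za-z_][\w.\-]*)[^>]*?(/?)>")

HEADER = "<stream> <headerline1>MainHeader</headerline1> <substreams>"
FOOTER = "</substreams> <footerline1>MainFooter</footerline1> </stream>"


def make_well_formed(fragment):
    """Add the opening and closing tags that a header or footer fragment lacks."""
    open_stack = []      # elements opened in the fragment but never closed
    missing_opens = []   # closing tags with no opening tag in the fragment
    for closing, name, self_closing in TAG.findall(fragment):
        if self_closing:
            continue
        if not closing:
            open_stack.append(name)
        elif open_stack and open_stack[-1] == name:
            open_stack.pop()
        else:
            missing_opens.append(name)
    # The outermost element must be opened first and closed last.
    prefix = "".join(f"<{name}>" for name in reversed(missing_opens))
    suffix = "".join(f"</{name}>" for name in reversed(open_stack))
    return prefix + fragment + suffix


fixed_header = make_well_formed(HEADER)
fixed_footer = make_well_formed(FOOTER)

# Both results now parse as complete XML documents.
header_root = ET.fromstring(fixed_header)
footer_root = ET.fromstring(fixed_footer)
print(fixed_header)
print(fixed_footer)
```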
You can configure the XmlStreamer to pass the header segment, the footer segment, and each instance of the substream segment to an appropriate transformation, such as a Mapper or Serializer.
The header segment elements are available when processing the body elements. The footer elements, however, are not yet available when processing body elements. Footer elements are only processed after the transformation finishes reading body elements.
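The availability rule follows from reading the document front to back, which the following Python sketch tries to show with the standard library's incremental parser. It is not the product's mechanism. The function name process, the sample data, and the choice to record a snapshot of the known header and footer values for each body segment are assumptions made for this illustration.

```python
import io
import xml.etree.ElementTree as ET

SOURCE = (
    b"<stream><headerline1>MainHeader</headerline1><substreams>"
    b"<substream><subheaderline1>SubHeader</subheaderline1></substream>"
    b"<substream><subheaderline1>SubHeader</subheaderline1></substream>"
    b"</substreams><footerline1>MainFooter</footerline1></stream>"
)


def process(xml_bytes, body_tag="substream"):
    """Return (header, body_log, footer) collected in a single forward pass."""
    header, footer, body_log = {}, {}, []
    in_body = False
    events = ET.iterparse(io.BytesIO(xml_bytes), events=("start", "end"))
    for event, elem in events:
        if elem.tag == body_tag:
            in_body = (event == "start")
            if event == "end":
                # Snapshot what is known at the moment this body segment is handled.
                body_log.append((dict(header), dict(footer)))
                elem.clear()  # release the segment's content once handled
        elif event == "end" and not in_body and len(elem) == 0:
            # Leaf elements outside the body belong to the header or the footer.
            target = footer if body_log else header
            target[elem.tag] = elem.text
    return header, body_log, footer


header, body_log, footer = process(SOURCE)
for known_header, known_footer in body_log:
    print("body segment sees header", known_header, "and footer", known_footer)
print("footer after the last body segment:", footer)
```

In this sketch every body segment sees the header values, while the footer dictionary stays empty until the parser has moved past the last body segment.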
Alternatively, you might subdivide the substream elements into segment1 and segment2 segments, and send each of these to its own Mapper or Serializer. Notice that segment1 and segment2 follow each other in no fixed sequence. The XmlStreamer ignores the sequence and processes segment1 and segment2 in whatever order they occur.
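The idea of routing each segment type to its own component, in document order, can be sketched in Python as follows. This is an analogy rather than the XmlStreamer itself. The function name stream_segments and the two handler functions are hypothetical stand-ins for the Serializers, and they only tag each segment's text with a label.

```python
import io
import xml.etree.ElementTree as ET

SOURCE = (
    b"<stream><headerline1>MainHeader</headerline1><substreams><substream>"
    b"<subheaderline1>SubHeader</subheaderline1><segments>"
    b"<segment1>Segment1A</segment1><segment1>Segment1B</segment1>"
    b"<segment2>Segment2A</segment2><segment1>Segment1C</segment1>"
    b"<segment2>Segment2B</segment2>"
    b"</segments><subfooterline1>SubFooter</subfooterline1>"
    b"</substream></substreams><footerline1>MainFooter</footerline1></stream>"
)


def stream_segments(xml_bytes, handlers):
    """Pass each matching element to its handler in the order it occurs."""
    results = []
    for _event, elem in ET.iterparse(io.BytesIO(xml_bytes), events=("end",)):
        handler = handlers.get(elem.tag)
        if handler is not None:
            results.append(handler(elem))
            elem.clear()  # the segment is no longer needed once handled
    return results


# Hypothetical stand-ins for the two Serializers.
def serialize_segment1(elem):
    return ("Segment1Serializer", elem.text)


def serialize_segment2(elem):
    return ("Segment2Serializer", elem.text)


results = stream_segments(
    SOURCE, {"segment1": serialize_segment1, "segment2": serialize_segment2}
)
for component, text in results:
    print(component, text)
```

The dispatch table is keyed by element name, so the interleaving of segment1 and segment2 in the source needs no special handling.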
The following figure illustrates the configuration for this purpose. The Script defines independent Serializers for the header, footer, segment1, and segment2 segments.
XmlStreamer1 = XmlStreamer
    Header = XmlSegment
        run_component = HeaderSerializer
    Footer = XmlSegment
        run_component = FooterSerializer
    sub_elements =
        ComplexXmlSegment >>
            locator = /stream/*s/substreams/*s/substream
            sub_elements =
                SimpleXmlSegment >>
                    locator = /stream/*s/substreams/*s/substream/*s/segments/*s/*s1/segment1
                    run_component = Segment1Serializer
                SimpleXmlSegment >>
                    locator = /stream/*s/substreams/*s/substream/*s/segments/*s/*s1/segment2
                    run_component = Segment2Serializer
            ...
        ...
HeaderSerializer = Serializer()
FooterSerializer = Serializer()
Segment1Serializer = Serializer()
Segment2Serializer = Serializer()
Even though the footer run component appears before the body elements in this example, footer elements are only processed after the transformation finishes reading body elements.
As a further refinement, you can define transformations for the nested headers and footers within each substream element.