Data Transformation User Guide

Example 1: Nested Multiple-Occurrence Data Holders

Suppose that the input schema of a serializer requires the following structure:
    <Report>
        <Company>
            <Employee>John</Employee>
            <Employee>Leslie</Employee>
            <Employee>Pedro</Employee>
        </Company>
        <Company>
            <Employee>Marie</Employee>
            <Employee>Larry</Employee>
            <Employee>Frances</Employee>
        </Company>
    </Report>
You want to iterate over all the Employee elements and produce the following output:
    John Leslie Pedro Marie Larry Frances
You might create a RepeatingGroupSerializer and configure it to output the Employee data holder.
    EmployeeSerializer = Serializer >>
        contains
        RepeatingGroupSerializer >>
            separator_position = before
            separator = ...
            contains
            ContentSerializer >>
                opening_str = ""
                closing_str = ""
                data_holder = /Report/*s/Company/*s/Employee
            ...
        ...
This does not work correctly. By default, each iteration selects a new instance of Employee within the same Company. The result is the following output:
    John Leslie Pedro
In other words, the RepeatingGroupSerializer accesses only the first Company.
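The effect can be imitated outside Data Transformation. The following is only a rough analogy in Python, using the standard xml.etree.ElementTree module rather than IntelliScript; the function name first_company_employees is hypothetical. It resolves Company once and then repeats over Employee, which is the behavior described above.

```python
import xml.etree.ElementTree as ET

REPORT_XML = """
<Report>
  <Company>
    <Employee>John</Employee><Employee>Leslie</Employee><Employee>Pedro</Employee>
  </Company>
  <Company>
    <Employee>Marie</Employee><Employee>Larry</Employee><Employee>Frances</Employee>
  </Company>
</Report>
"""


def first_company_employees(xml_text):
    """Analogy for the single-level iteration: Company is never advanced."""
    report = ET.fromstring(xml_text.strip())
    company = report.find("Company")  # find() returns the first match only
    if company is None:
        return []
    # Only Employee varies between iterations; Company stays fixed.
    return [employee.text for employee in company.findall("Employee")]


print(" ".join(first_company_employees(REPORT_XML)))  # John Leslie Pedro
```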
You can solve the problem by nesting the RepeatingGroupSerializer inside another RepeatingGroupSerializer. To resolve any potential ambiguities, you can configure the source properties explicitly.
    EmployeeSerializer = Serializer >>
        contains
        RepeatingGroupSerializer >>
            separator_position = before
            separator = ...
            source = ...
                Locator = >>
                    data_holder = /Report/*s/Company
                ...
            contains
            RepeatingGroupSerializer >>
                separator_position = before
                separator = ...
                source = ...
                    Locator = >>
                        data_holder = /Report/*s/Company/*s/Employee
                    ...
                contains
                ContentSerializer (,,/Report/*s/Company/*s/Employee)
Each iteration of the outer RepeatingGroupSerializer processes a different occurrence of Company. Each iteration of the nested RepeatingGroupSerializer processes a different occurrence of Employee. The result is the desired output.
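As an informal analogy only (Python with the standard xml.etree.ElementTree module, not IntelliScript; the function name all_employees is hypothetical), the nested arrangement corresponds to an outer loop over Company and an inner loop over Employee:

```python
import xml.etree.ElementTree as ET

NESTED_REPORT_XML = """
<Report>
  <Company>
    <Employee>John</Employee><Employee>Leslie</Employee><Employee>Pedro</Employee>
  </Company>
  <Company>
    <Employee>Marie</Employee><Employee>Larry</Employee><Employee>Frances</Employee>
  </Company>
</Report>
"""


def all_employees(xml_text):
    """Analogy for two nested repeating groups."""
    report = ET.fromstring(xml_text.strip())
    names = []
    # Outer loop: one iteration per occurrence of Company.
    for company in report.findall("Company"):
        # Inner loop: one iteration per occurrence of Employee in that Company.
        for employee in company.findall("Employee"):
            names.append(employee.text)
    return names


print(" ".join(all_employees(NESTED_REPORT_XML)))
# John Leslie Pedro Marie Larry Frances
```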
Alternatively, suppose you want to output only the second Employee element in each Company. The desired output is:
    Leslie Larry
You can do this by configuring a single RepeatingGroupSerializer whose source is Company. This causes each iteration to access the next instance of Company. Within the iteration, you can configure a GroupSerializer whose source property uses a LocatorByOccurrence to select the second Employee. This generates the desired output.
    EmployeeSerializer = Serializer >>
        contains
        RepeatingGroupSerializer >>
            separator_position = before
            separator = ...
            source = ...
                Locator = >>
                    data_holder = /Report/*s/Company
                ...
            contains
            GroupSerializer >>
                source = ...
                    LocatorByOccurrence = >>
                        recurring_element = /Report/*s/Company/*s/Employee
                        occurrence_number = 2
                    ...
                contains
                ContentSerializer (,,/Report/*s/Company/*s/Employee)
            ...
        ...
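For comparison, the same selection can be sketched in Python with the standard xml.etree.ElementTree module. This is an analogy, not IntelliScript: the function name nth_employee_per_company is hypothetical, and its occurrence_number parameter is 1-based only to mirror the occurrence_number = 2 setting shown above.

```python
import xml.etree.ElementTree as ET

OCCURRENCE_REPORT_XML = """
<Report>
  <Company>
    <Employee>John</Employee><Employee>Leslie</Employee><Employee>Pedro</Employee>
  </Company>
  <Company>
    <Employee>Marie</Employee><Employee>Larry</Employee><Employee>Frances</Employee>
  </Company>
</Report>
"""


def nth_employee_per_company(xml_text, occurrence_number):
    """Pick one Employee, by 1-based position, from every Company."""
    report = ET.fromstring(xml_text.strip())
    names = []
    # Single loop: each iteration moves to the next Company.
    for company in report.findall("Company"):
        employees = company.findall("Employee")
        # Skip a Company that has fewer employees than requested.
        if 1 <= occurrence_number <= len(employees):
            names.append(employees[occurrence_number - 1].text)
    return names


print(" ".join(nth_employee_per_company(OCCURRENCE_REPORT_XML, 2)))  # Leslie Larry
```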