parser and mapper components can transform data from any format and generate XML data. When the XML data is large, you can split the XML into segments and pass the segments to an XML Parser transformation. The XML Parser transformation receives the segments and processes the XML data as one document.
When you configure the Unstructured Data transformation to split XML output, the transformation returns XML based on the OutputBuffer port size. If the XML is larger than the output port precision, the Integration Service divides the XML into segments that are each equal to or less than the port size. The XML Parser transformation parses the XML and passes the rows to relational tables or other targets.
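The size-based split described above can be illustrated with a minimal Python sketch. It is not Informatica code: the function name `split_into_segments` and the `port_size` parameter are hypothetical, and the sketch assumes the port precision is counted in characters.

```python
def split_into_segments(xml_text: str, port_size: int) -> list[str]:
    """Divide xml_text into consecutive segments of at most port_size characters.

    port_size stands in for the OutputBuffer port precision (an assumption
    made for this illustration).
    """
    if port_size <= 0:
        raise ValueError("port_size must be a positive integer")
    # Cut at fixed offsets; segment boundaries may fall in the middle of a tag.
    return [
        xml_text[start:start + port_size]
        for start in range(0, len(xml_text), port_size)
    ]


sample_xml = '<order><header id="1001"/><detail line="1"/></order>'
segments = split_into_segments(sample_xml, 16)
```

Because a boundary can fall inside a tag, an individual segment is generally not well-formed XML. The receiver has to treat the segments, in order, as one document.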
For example, you can extract the order header and detail information from Microsoft Word documents with a Data Transformation parser service.
The mapping has the following components:
Source Qualifier transformation. Passes the Word document file name to the Unstructured Data transformation. The source file name contains the complete path to the file that contains order information.
Unstructured Data transformation. The input type is file. The output type is splitting. The Unstructured Data transformation receives the source file name in the InputBuffer port. It passes the file name to Data Transformation Engine. Data Transformation Engine opens the source file, parses it, and returns XML data to the Unstructured Data transformation.
The Unstructured Data transformation receives the XML data, splits the XML into smaller segments, and passes the segments to an XML Parser transformation. Each segment is equal to or less than the OutputBuffer port size. When the transformation returns XML data in multiple segments, it generates the same pass-through data for each row. The Unstructured Data transformation returns data in pass-through ports whether or not a row is processed successfully.
XML Parser transformation. The Enable Input Streaming session property is enabled. The XML Parser transformation receives the XML data in the DataInput port. The input data is split into segments. The XML Parser transformation parses the XML data into order header and detail rows. It passes order header and detail rows to relational targets. It returns the pass-through data to a Filter transformation.
Filter transformation. Removes the duplicate pass-through data before passing it to the relational targets.
Relational targets. Receive data from each group in the XML Parser transformation and the Filter transformation.
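The data flow through these components can be approximated in plain Python. The sketch below is an analogy, not Informatica code: every function name, the sample order XML, and the file path are hypothetical. It uses the standard library `XMLPullParser` to stand in for an XML Parser transformation with input streaming enabled.

```python
import xml.etree.ElementTree as ET


def split_with_passthrough(xml_text, port_size, passthrough):
    """Return one (segment, passthrough) row per segment.

    The same pass-through value is repeated on every row.
    """
    return [
        (xml_text[i:i + port_size], passthrough)
        for i in range(0, len(xml_text), port_size)
    ]


def parse_segments(rows):
    """Feed the segments to one incremental parser so they form one document."""
    parser = ET.XMLPullParser(events=("end",))
    headers, details, passthrough_rows = [], [], []

    def drain():
        # Collect every element that has been completely parsed so far.
        for _event, elem in parser.read_events():
            if elem.tag == "header":
                headers.append(dict(elem.attrib))
            elif elem.tag == "detail":
                details.append(dict(elem.attrib))

    for segment, passthrough in rows:
        parser.feed(segment)
        passthrough_rows.append(passthrough)
        drain()
    parser.close()
    drain()  # events still queued at close time remain readable
    return headers, details, passthrough_rows


def filter_duplicates(values):
    """Keep the first occurrence of each pass-through value, in order."""
    seen, unique = set(), []
    for value in values:
        if value not in seen:
            seen.add(value)
            unique.append(value)
    return unique


# Hypothetical sample data for illustration only.
order_xml = (
    '<order><header id="1001" customer="ACME"/>'
    '<detail line="1" item="bolt"/><detail line="2" item="nut"/></order>'
)
rows = split_with_passthrough(order_xml, 20, "/data/orders/order1001.doc")
headers, details, passthrough_rows = parse_segments(rows)
unique_passthrough = filter_duplicates(passthrough_rows)
```

The sketch shows why the Filter transformation is needed. The pass-through value arrives once per segment, while the header and detail rows are produced once per document.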