Troubleshooting Midstream Parsing

Consider the following troubleshooting tips when you use midstream mappings to parse hierarchical data:
The midstream mapping did not parse the complex data completely.
If the parser encounters a data field that it cannot parse, the parser ignores the unrecognized field and does not fail the mapping job.
The PARSE_JSON and PARSE_XML complex functions ignore data in the following situations:
  • The JSON or XML structure is invalid.
  • The JSON or XML source file contains invalid or unrecognized data.
  • The data does not match the schema provided in the type definition library.
  • The complex data schema of the source string changes. For example, a software update on the server might change automated JSON or XML input.
  • The wrong element is selected as the root, which can cause the parser to flatten parent data in an array.
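Because the parser ignores unparseable data without failing the job, it can help to check a sample of the source strings outside the mapping first. The following Python sketch is an illustration only. It does not call any Informatica API, and the function name `check_json_record` and the `expected_fields` set are hypothetical stand-ins for the schema in the type definition library. It flags invalid JSON and lists top-level fields that are missing or unrecognized.

```python
import json


def check_json_record(raw, expected_fields):
    """Return a list of problems found in one JSON source string.

    expected_fields is a hypothetical set of top-level field names that
    stands in for the schema in the type definition library.
    """
    try:
        record = json.loads(raw)
    except json.JSONDecodeError as err:
        return [f"invalid JSON: {err.msg} at position {err.pos}"]

    if not isinstance(record, dict):
        return [f"expected a JSON object at the root, found {type(record).__name__}"]

    present = set(record)
    problems = []
    # Fields the schema expects but the record does not contain.
    for name in sorted(expected_fields - present):
        problems.append(f"missing field: {name}")
    # Fields in the record that the schema does not know about.
    for name in sorted(present - expected_fields):
        problems.append(f"unrecognized field: {name}")
    return problems


if __name__ == "__main__":
    schema = {"id", "name"}
    samples = ['{"id": 1, "name": "a"}', '{"id": 2, "nickname": "b"}', '{"id": 3']
    for raw in samples:
        print(raw, "->", check_json_record(raw, schema))
```

An empty list means the record is valid JSON and its top-level fields match the assumed schema. The sketch checks only top-level field names, so nested mismatches need a fuller schema check.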
To troubleshoot unparsed data, capture the data from the source string and save it in an UnassignedData array as an unidentifiedDataItem. Then, use this information to analyze the data.
Contact Informatica Global Customer Support and refer to the internal KB 627032.
This debugging tool can have an impact on system performance.
The midstream mapping loses data when parsing arrays.
Data loss occurs when the PARSE_JSON or PARSE_XML functions parse a schema that contains an array or a multidimensional array with struct elements. The resulting data loss depends on the array type. For example:
  • Array with struct elements. The parser function returns only the first element of an array. The remaining array elements are ignored.
  • Multidimensional array with struct elements. The parser function returns a struct instead of an array for an intermediate array element.
When you create the intelligent structure model using sample data, Intelligent Structure Discovery selects an intermediate element as the root node. For example, when an array contains struct elements, Intelligent Structure Discovery selects the first struct element as the root node. This behavior causes data loss for midstream parsing. To avoid data loss, select the top-level node as root.
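The effect of the root node choice can be illustrated outside Informatica. The following Python sketch is a conceptual analogy only, not the behavior of PARSE_JSON itself. The sample payload and the helper names `read_from_intermediate_root` and `read_from_top_level_root` are hypothetical. It contrasts a model rooted at the first struct element with a model rooted at the top-level array.

```python
import json

# Hypothetical sample: a top-level array whose elements are structs.
SAMPLE = '[{"id": 1, "tags": ["a"]}, {"id": 2, "tags": ["b"]}, {"id": 3, "tags": []}]'


def read_from_intermediate_root(raw):
    """Mimic a model rooted at the first struct element: later elements are dropped."""
    data = json.loads(raw)
    return [data[0]] if data else []


def read_from_top_level_root(raw):
    """Mimic a model rooted at the top-level array: every element is kept."""
    return json.loads(raw)


if __name__ == "__main__":
    print(len(read_from_intermediate_root(SAMPLE)), "element(s) with the intermediate root")
    print(len(read_from_top_level_root(SAMPLE)), "element(s) with the top-level root")
```

Comparing the two element counts against a known sample is a quick way to spot whether a model is rooted too deep.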
Use Informatica Intelligent Cloud Services Intelligent Structure Discovery to modify the intelligent structure model.
Select the top-level node, and then right-click and select Mark As Root. The following image shows how to modify the root node:
The intelligent structure model contains a top-level node, which is an array. An intermediate array element is the current root node. The right-click menu shows the Mark As Root option.
The brown-colored node shows the current root node. The green-colored node shows the selected root node.
For more information, see the "Intelligent structure models" section of the Informatica Intelligent Cloud Services Data Integration help.
The midstream mapping does not work when the data or field names contain special characters.
The input data is encoded as a UTF-8 string. Check the input flat file data object configuration in the Developer tool.
In the Object Explorer view, open the flat file data object under Physical Data Objects. On the Advanced tab, in the Format section, set the Code page attribute to UTF-8.
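Before changing the code page, it can be useful to confirm whether the source file really is valid UTF-8. The following Python sketch is a standalone check that runs outside the Developer tool. The function name `find_non_utf8_lines` is hypothetical, and the file path is whatever flat file you want to inspect.

```python
def find_non_utf8_lines(path):
    """Return (line_number, error_message) pairs for lines that are not valid UTF-8."""
    bad = []
    # Read raw bytes so that decoding errors surface per line instead of
    # stopping at the first bad byte.
    with open(path, "rb") as handle:
        for number, line in enumerate(handle, start=1):
            try:
                line.decode("utf-8")
            except UnicodeDecodeError as err:
                bad.append((number, str(err)))
    return bad
```

An empty result means every line decodes as UTF-8. Any reported line numbers point to data that was probably written in a different code page.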


Updated September 28, 2020