Overview of Hierarchical Data Processing with Schema Changes
You can use dynamic complex ports to manage schema changes to hierarchical data. Dynamic ports receive new and changed hierarchical data columns from a dynamic complex file source. Dynamic complex ports then manage schema or metadata changes within a complex port.
To process hierarchical data on the Spark engine, you use complex ports in transformations. To handle metadata changes at run time, you create dynamic mappings with dynamic ports. A dynamic port can propagate hierarchical data through a generated port of a complex data type. However, if the schema of the hierarchical data changes at run time, you must use dynamic complex ports to pass hierarchical data in the mapping.
Complex ports require you to specify the type configuration based on the data type of the elements in an array or a map or the schema of a struct. With dynamic complex ports, you do not have to specify the type configuration for a hierarchical column. For example, you do not have to specify a complex data type definition for a dynamic struct.
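The difference between a fixed type configuration and a schema resolved at run time can be sketched in plain Python. This is a conceptual analogy only, not Informatica or Spark code: the names CUSTOMER_STRUCT, read_with_fixed_schema, and read_with_dynamic_schema are hypothetical, and the behavior of the fixed reader (ignoring undeclared elements) is an assumption made for illustration.

```python
from typing import Dict, Tuple

# Conceptual analogy only: plain Python, not Informatica or Spark code.

# Analogue of a complex port: the struct schema is declared up front.
# CUSTOMER_STRUCT is a hypothetical type definition.
CUSTOMER_STRUCT = {"name": str, "city": str}


def read_with_fixed_schema(record: Dict) -> Dict:
    """Keep only the declared fields; fail if a declared field is missing."""
    out = {}
    for field, field_type in CUSTOMER_STRUCT.items():
        if field not in record:
            raise KeyError(f"missing declared field: {field}")
        out[field] = field_type(record[field])
    return out


def read_with_dynamic_schema(record: Dict) -> Tuple[Dict, Dict]:
    """Analogue of a dynamic complex port: derive the schema from the data."""
    schema = {field: type(value) for field, value in record.items()}
    return dict(record), schema


# The source adds a "zip" element at run time.
changed = {"name": "Ada", "city": "London", "zip": "N1"}

# In this sketch, the fixed reader ignores the undeclared "zip" element.
fixed = read_with_fixed_schema(changed)

# The dynamic reader carries "zip" through and records its type.
dynamic, schema = read_with_dynamic_schema(changed)
```

In this sketch, the fixed reader needs its definition edited before it can see a new element, while the dynamic reader needs no definition at all.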
With dynamic complex ports, you can perform the following tasks in a dynamic mapping:
Receive new or changed elements of a complex port if the schema of the complex port changes at run time.
Modify hierarchical data or filter elements of hierarchical data with input rules for a dynamic complex port.
Use complex operators to extract elements of hierarchical data that might have schema changes at run time.
Generate arrays and structs with dynamic expressions that use complex functions.
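The tasks above can be illustrated with a rough Python analogy. It does not use Informatica input rules, complex operators, or complex functions. The helpers apply_input_rule, extract, and generate_struct and the sample order record are hypothetical stand-ins that only mirror the idea of each task.

```python
from fnmatch import fnmatchcase

# Conceptual analogy only: plain Python, not Informatica syntax.

# A hierarchical record whose elements may change between runs.
order = {
    "id": 7,
    "tmp_flag": True,
    "items": [{"sku": "A1", "qty": 2}],
    "ship_city": "Oslo",
}


def apply_input_rule(struct, exclude_pattern):
    """Filter elements by name, loosely analogous to an input rule."""
    return {
        name: value
        for name, value in struct.items()
        if not fnmatchcase(name, exclude_pattern)
    }


def extract(struct, name, default=None):
    """Extract one element, tolerating a schema where it is absent."""
    return struct.get(name, default)


def generate_struct(**elements):
    """Build a new struct from computed elements."""
    return dict(elements)


# Receive whatever elements arrive, then drop those matching "tmp_*".
filtered = apply_input_rule(order, "tmp_*")

# Extract elements that may or may not exist in this run's schema.
city = extract(filtered, "ship_city")
region = extract(filtered, "ship_region", "unknown")

# Generate an array and a struct from the extracted data.
skus = [item["sku"] for item in extract(filtered, "items", [])]
summary = generate_struct(id=filtered["id"], skus=skus, city=city)
```

Extraction with a default value is one way to tolerate an element that is missing from a given run. Whether a missing element should instead be treated as an error depends on the mapping's requirements.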
For more information about dynamic mappings, see the Informatica Developer Mapping Guide.
You cannot process hierarchical data with schema changes on the Databricks Spark engine.