Rules and Guidelines for Processing Hierarchical Data on the Spark Engine
There are processing differences when you work with complex data types in a mapping that runs on the Spark engine.
Consider the following rules and guidelines when you use complex data types in a mapping that runs on the Spark engine:
You cannot read hierarchical data from or write hierarchical data to a Hive source in a dynamic mapping.
When you read hierarchical data from a Hive source, you cannot enable Hive LLAP for Hive queries.
When you read hierarchical data from a Hive source, the Spark engine converts float type data to double. Use the double data type when you read from and write to a Hive source to prevent precision errors.
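The precision concern in the preceding guideline can be illustrated outside of the product. The following is a minimal sketch in plain Python, not Informatica or Spark code. It assumes that the conversion behaves like ordinary IEEE 754 widening of a 32-bit float to a 64-bit double, and the helper name widen_float32 is hypothetical:

```python
import struct


def widen_float32(value: float) -> float:
    """Round value to 32-bit float precision, then return it as a 64-bit double.

    Hypothetical helper for illustration only. It mimics what happens when a
    value stored as a 32-bit float is later read back as a double.
    """
    packed = struct.pack("<f", value)      # store as a 4-byte IEEE 754 float
    return struct.unpack("<f", packed)[0]  # read back; Python floats are doubles


original = 0.1                     # value as a double
widened = widen_float32(original)  # same value after a float round trip

print(widened)              # 0.10000000149011612
print(widened == original)  # False
```

The widened value carries the rounding error of the 32-bit representation, so comparisons against the original double fail. Declaring the port as double from the start avoids the intermediate 32-bit rounding.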
When you write date/time data within a complex data type to a Hive target using HDP 3.1, configure the time zone as UTC. In the Hadoop connection Spark advanced properties, append "-Duser.timezone=UTC" to the end of the value for the following properties: