You can use a Source transformation in advanced mode to read hierarchical data from
complex files, such as Avro, JSON, and Parquet files. Advanced mode represents the data as
an array, map, or struct.
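As a rough illustration of these three shapes, the following Python sketch parses a small JSON record and labels each top-level field. The record and its field names are hypothetical, and the sketch does not reproduce how advanced mode itself assigns data types. In particular, a JSON object alone does not say whether it is a struct (fixed, named child fields) or a map (arbitrary keys with values of one type), so the sketch labels both cases together.

```python
import json

# Hypothetical sample record; the field names are illustrative only.
record = json.loads("""
{
  "id": 101,
  "name": {"first": "Ada", "last": "Lovelace"},
  "phones": ["555-0100", "555-0101"],
  "attributes": {"team": "analytics", "region": "emea"}
}
""")


def describe(value):
    """Return a rough label for the shape of a parsed JSON value."""
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        # JSON cannot distinguish a struct from a map without a schema.
        return "struct or map"
    return "primitive"


shapes = {field: describe(value) for field, value in record.items()}
print(shapes)
```

Here `name` reads naturally as a struct and `attributes` as a map, but that distinction comes from the schema or the structure model, not from the JSON text.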
To read hierarchical data, set the format on the Source tab to a hierarchical format, such
as JSON, or to Discover Structure. Use Discover Structure when you want to use an
intelligent structure model to define the structure of your data. For more information,
see Components.
Downstream in the mapping, you can use the hierarchical fields as pass-through fields to
convert data from one complex file format to another. For example, you can read
hierarchical data from an Avro source and write the data to a JSON target. You can also
use the hierarchical fields and their child fields in expressions and conditions in
downstream transformations. For information about accessing child fields, see the
Function Reference.
You can pass hierarchical fields to the following transformations:
Target
Aggregator
Expression
Filter
Hierarchy Processor
Joiner
Rank
Router
Sequence Generator
Sorter
Rules and guidelines for reading hierarchical data
Consider the following guidelines when you read hierarchical data:
You must use an Amazon S3 V2 or Azure Data Lake Storage Gen2 connection to read hierarchical data. For more information, see the help for the appropriate connector.
To read data from an XML source, use an intelligent structure model in the Source transformation. For information about intelligent structure models, see Components.
You cannot use a parameter for the source connection or the source object.
If hierarchical fields contain child fields with decimal data types, the mapping runs using low precision.
The transformation sets the precision and scale based on the values in the first row of data (sometimes referred to as row 0).
To avoid data truncation, increase the precision and scale in the first row of data. Also ensure that the first row does not include null values.
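The effect of the first row on decimal values can be sketched in Python. This is a simplified model under stated assumptions, not the transformation's actual logic: the helper names `precision_and_scale` and `truncate_to_scale` are hypothetical, the sketch models only the loss of fractional digits, and it does not model what happens when a later value has more integer digits than the first row allows.

```python
from decimal import Decimal, ROUND_DOWN


def precision_and_scale(value):
    """Return (precision, scale) for a finite decimal value."""
    _sign, digits, exponent = value.as_tuple()
    scale = max(-exponent, 0)                      # digits after the decimal point
    integer_digits = max(len(digits) + exponent, 0)
    return integer_digits + scale, scale


def truncate_to_scale(value, scale):
    """Drop fractional digits beyond the given scale (assumed behavior)."""
    quantum = Decimal(1).scaleb(-scale)            # e.g. scale 1 -> Decimal('0.1')
    return value.quantize(quantum, rounding=ROUND_DOWN)


# Narrow first row: later values lose fractional digits.
rows = [Decimal("12.5"), Decimal("1234.5678")]
precision, scale = precision_and_scale(rows[0])
narrow = [truncate_to_scale(v, scale) for v in rows]

# Wide first row: every value keeps its fractional digits.
rows_wide_first = [Decimal("1234.5678"), Decimal("12.5")]
wide_precision, wide_scale = precision_and_scale(rows_wide_first[0])
wide = [truncate_to_scale(v, wide_scale) for v in rows_wide_first]

print(precision, scale, narrow)
print(wide_precision, wide_scale, wide)
```

In this model, placing the value with the largest precision and scale in the first row is what prevents truncation, which matches the guideline above.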