The Structure Parser transformation transforms your input data into a user-defined structured format based on an intelligent structure model. You can use the Structure Parser transformation to analyze data such as log files, clickstreams, XML or JSON files, Word tables, and other unstructured or semi-structured formats.
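As a rough illustration of what turning semi-structured input into a structured format means, the following Python sketch parses one log line into named fields. It is only an analogy: the log layout, the `LOG_PATTERN` expression, and the `parse_line` helper are assumptions made for this example and are not part of the Structure Parser transformation, which derives the structure from an intelligent structure model rather than from a hand-written pattern.

```python
import re

# Hypothetical log layout, chosen only for illustration:
# <timestamp> <LEVEL> <component>: <message>
LOG_PATTERN = re.compile(
    r"(?P<timestamp>\S+) (?P<level>[A-Z]+) (?P<component>\w+): (?P<message>.*)"
)


def parse_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = LOG_PATTERN.fullmatch(line.strip())
    return match.groupdict() if match else None


record = parse_line("2024-01-15T10:32:00Z ERROR auth: login failed for user 42")
print(record)
```

Each named group becomes one output field, which is loosely comparable to a node in the structured output.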
You can connect a Structure Parser transformation to the following types of sources:
A Source transformation based on a flat file to process local input files
A Source transformation based on a Hadoop Files V2 connection to stream input files in HDFS or to process local input files
When you configure a Structure Parser transformation, you associate it with an intelligent structure model. An intelligent structure model is an asset that Intelligent Structure Discovery generates to represent the data that you expect the model to parse at run time. You can create a model before you configure the Structure Parser transformation or as you configure it.
Intelligent Structure Discovery generates the intelligent structure model based on a sample of your input data or a schema that you provide. You can create a model from various input types. For more information about the input types, see Inputs for intelligent structure models.
After Intelligent Structure Discovery generates the intelligent structure model, you can refine the model and customize the structure of the output data. You can edit the nodes in the model to combine, exclude, flatten, or collapse them.
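To give a feel for how node edits reshape output data, the sketch below applies two of these ideas, excluding a node and flattening nested nodes, to a plain Python dictionary. This is a loose analogy under stated assumptions: the `exclude` and `flatten` helpers, the dotted-path naming, and the sample record are hypothetical, and the exact behavior of the corresponding model operations may differ. In the product, you edit nodes in the model itself, not in code.

```python
def exclude(record, path):
    """Return a copy of record without the node at the dotted path."""
    head, _, rest = path.partition(".")
    result = dict(record)  # shallow copy so the input is left untouched
    if head not in result:
        return result
    if not rest:
        del result[head]
    elif isinstance(result[head], dict):
        result[head] = exclude(result[head], rest)
    return result


def flatten(record, prefix=""):
    """Turn nested dicts into a single level keyed by dotted paths."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


# Hypothetical parsed record, used only for illustration.
parsed = {
    "user": {"id": 42, "session": {"token": "abc", "ip": "10.0.0.1"}},
    "status": "ok",
}

trimmed = exclude(parsed, "user.session.token")
output = flatten(trimmed)
print(output)
```

Here excluding a node drops it from the output, and flattening replaces the nested hierarchy with a single level of fields.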