JupyterLab Extension for INFACore

Parse unstructured data

You can transform your input data into a user-defined structured format based on an intelligent structure model. You can use the Parse Unstructured Data function to analyze data such as log files, XML or JSON files, Word tables, and other unstructured or semi-structured formats.
To parse unstructured data, use the Parse Unstructured Data function. The function uses Intelligent Structure Discovery to determine the underlying structure of the sample data file and creates a model of the structure.
Intelligent Structure Discovery creates the intelligent structure model based on a sample of your input data.
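To make the idea of a "structured format" concrete, the following is a minimal conceptual sketch in plain Python. It does not use the INFACore API or Intelligent Structure Discovery. The regular expression, the field names, and the `parse_log_line` helper are hypothetical stand-ins for a structure that a model might describe for a weblog line.

```python
import re

# Hypothetical pattern standing in for a discovered structure.
# It is written by hand here and is NOT produced by INFACore.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_log_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    # Convert numeric fields so the output is typed, not just split text.
    record["status"] = int(record["status"])
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record


sample = '192.0.2.10 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326'
record = parse_log_line(sample)
print(record)
```

In this sketch the structure is fixed in advance. The point of a model built from a sample file is that you do not have to write such a pattern by hand.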
You can create models from the following input types:
  • Text files, including delimited files such as CSV files and complex files that contain textual hierarchies
  • Machine-generated files such as weblogs and clickstreams
  • JSON files
  • XML files
  • ORC files
  • Avro files
  • Parquet files
  • Microsoft Excel files
  • Data within PDF form fields
  • Data within Microsoft Word tables
  • XSD files
You can refine the intelligent structure model and customize the structure of the output data. You can edit the nodes in the model to combine, exclude, flatten, or collapse them.
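As a rough illustration of what flattening and excluding nodes mean for hierarchical output, the plain-Python sketch below applies both operations to a nested record. It is conceptual only and is not the INFACore API. The `flatten` and `exclude` helpers, the underscore naming scheme, and the sample record are assumptions made for this example.

```python
def flatten(record, parent_key="", sep="_"):
    """Turn nested dicts into a single-level dict with joined key names."""
    flat = {}
    for key, value in record.items():
        name = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into child nodes and merge their flattened fields.
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


def exclude(record, names):
    """Return a copy of the record without the named fields."""
    return {key: value for key, value in record.items() if key not in names}


# Hypothetical hierarchical record used only for illustration.
nested = {
    "order": {
        "id": 17,
        "customer": {"name": "Ada", "email": "ada@example.com"},
    },
    "status": "shipped",
}

flat = flatten(nested)
trimmed = exclude(flat, {"order_customer_email"})
print(trimmed)
```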
You can process input from the source efficiently and seamlessly based on the intelligent structure model that you select. When you select the function, you associate it with the intelligent structure model. To process local input files, select a data source that is based on a flat file.
