Creating an intelligent structure model using Claire engine
Create an intelligent structure model based on input that represents the data that you expect the model to parse at run time.
Click New > Components > Intelligent Structure Model, and then click Create.
On the Intelligent Structure Model page, enter a name for the intelligent structure model.
The name can contain alphanumeric characters and underscores.
Navigate to the project and folder where you want to save the model, or use the default location.
After you save the intelligent structure model, you can change its name or location on the Explore page.
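The naming rule in this step can be checked before you type a name into the page. The following Python sketch is illustrative only and is not part of the product. The function name `is_valid_model_name` is hypothetical, and the sketch assumes that "alphanumeric" means ASCII letters and digits and that an empty name is not allowed. The product may apply additional rules, such as length limits, that are not stated here.

```python
import re

# Assumption: "alphanumeric" means ASCII letters and digits only.
# The product's own validation may be stricter or looser than this.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9_]+")


def is_valid_model_name(name: str) -> bool:
    """Return True if the name has only ASCII letters, digits, and underscores."""
    # fullmatch requires the whole string to match, so an empty string fails.
    return _NAME_PATTERN.fullmatch(name) is not None
```

For example, `is_valid_model_name("weblog_model_1")` returns `True`, while a name that contains a space or a hyphen returns `False`.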
Under AI-powered Model, select Claire Engine.
You can create models from the following input types when you use the Claire engine:
Avro files
Cobol copybooks
Data within PDF form fields
Data within Microsoft Word tables
JSON files
Machine-generated files such as weblogs and clickstreams
Microsoft Excel files
ORC files
PDF files, including scanned PDFs
Parquet files
Text files, including delimited files such as CSV files and complex files that contain textual hierarchies
XML files
XSD files
Image files with .jpg, .jpeg, and .png extensions
Based on the type of input file, perform one of the following steps:
To use a JSON sample file, first choose whether to base the model on a file sampling or on the entire file. Select the file, and then click Discover Structure.
To use an XML sample file, first choose whether to base the model on a file sampling or on the entire file. Select the file, choose how you want to define the output groups, and then click Discover Structure.
To use a PDF file, first choose whether to base the model on a file sampling or on the entire file. Select the file, choose how you want to define the output groups, and then click Discover Structure.
To use an Avro schema file or any other type of sample file, select the file and click Discover Structure.
To use an XSD schema file, first choose whether to base the model on a file sampling or, if the schema is larger than 1.5 MB, to base the model on the entire schema. Select the file, verify that the schema root is selected, choose how you want to define the output groups, and click Discover Structure. If you intend to use the model in a Structure Parser transformation, you can click Upload XML Sample and select an XML sample file to attach to the model.
To use a Cobol copybook, select the copybook. If required, modify the values of File organization and Code Page to use at run time. Click Discover Structure.
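The per-type steps above differ mainly in which choices you make before you click Discover Structure. As a quick reference, the following Python sketch restates those choices as a lookup table. It is only a summary of this procedure, not a product API: the names `DiscoveryOptions`, `OPTIONS_BY_INPUT_TYPE`, and `options_for` are hypothetical, and the short type keys such as `"json"` and `"cobol"` are labels chosen for the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiscoveryOptions:
    """Choices to make in the UI before clicking Discover Structure."""
    choose_sampling: bool = False       # file sampling versus entire file
    choose_output_groups: bool = False  # how to define the output groups
    extra_steps: tuple = ()             # other actions listed in the procedure


# Hypothetical keys; the values restate the steps in this procedure.
OPTIONS_BY_INPUT_TYPE = {
    "json": DiscoveryOptions(choose_sampling=True),
    "xml": DiscoveryOptions(choose_sampling=True, choose_output_groups=True),
    "pdf": DiscoveryOptions(choose_sampling=True, choose_output_groups=True),
    "xsd": DiscoveryOptions(
        choose_sampling=True,
        choose_output_groups=True,
        extra_steps=(
            "Verify that the schema root is selected",
            "Optionally click Upload XML Sample",
        ),
    ),
    "cobol": DiscoveryOptions(
        extra_steps=("If required, modify File organization and Code Page",),
    ),
}


def options_for(input_type: str) -> DiscoveryOptions:
    """Return the choices for an input type; other types need only the file."""
    return OPTIONS_BY_INPUT_TYPE.get(input_type.lower(), DiscoveryOptions())
```

Any type that is not in the table, such as an Avro schema file, falls through to the default: select the file and click Discover Structure.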
After you click Discover Structure, Intelligent Structure Discovery deciphers the data in the input and discovers the patterns expressed in the data. The following image shows an example of discovered structure on the Visual Model tab:
Intelligent Structure Discovery creates nodes with unique names. If Intelligent Structure Discovery discovers instances of the same type of data, it adds a number suffix to the node names. For example, if the input contains timestamps in two tables, Intelligent Structure Discovery names them timestamp1 and timestamp2.
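The suffix rule can be pictured with a short Python sketch. This is not the algorithm that Intelligent Structure Discovery uses internally, which is not described here. It is a minimal illustration under two assumptions: a name that occurs only once keeps its original form, and suffixes count up from 1 in the order in which the nodes appear. The function name `assign_unique_names` is hypothetical, and the sketch does not handle a generated name that collides with an existing one.

```python
from collections import Counter


def assign_unique_names(base_names):
    """Append a numeric suffix to every name that occurs more than once."""
    totals = Counter(base_names)  # how often each base name occurs overall
    seen = Counter()              # how many times each name was seen so far
    unique_names = []
    for name in base_names:
        if totals[name] == 1:
            # Assumption: a name with no duplicates is left unchanged.
            unique_names.append(name)
        else:
            seen[name] += 1
            unique_names.append(f"{name}{seen[name]}")
    return unique_names


print(assign_unique_names(["timestamp", "ip", "timestamp"]))
# prints ['timestamp1', 'ip', 'timestamp2']
```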
When you base the model on an Avro, ORC, or Parquet file, Intelligent Structure Discovery discovers both the data elements and the elements of the file schema. By default, Intelligent Structure Discovery excludes elements that appear only in the schema from the model. To add schema elements to the output, include them in the structure of the model. For more information, see Performing actions on multiple nodes.
For a model that you create for an Excel worksheet, Intelligent Structure Discovery creates metadata nodes that contain the sheet index and the sheet name. By default, Intelligent Structure Discovery excludes those nodes from the structure of the model. To add the nodes to the output, include them in the structure. For more information, see Edit the structure of Microsoft Excel input.
You can refine the structure so that when you use the model in production, the output meets your requirements. For more information, see Refining intelligent structure models.