Inputs for intelligent structure models

The input that you base an intelligent structure model on can be a sample file, an XSD schema, an Avro schema, or a COBOL copybook, depending on the input that you expect the model to process at run time.
Input files can be up to 1 MB in size. An input file can contain up to 30,000 simple fields. If the file contains more than 30,000 simple fields, Intelligent Structure Discovery creates the model without groups and ports. The number of levels in the hierarchy isn't limited.
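
The limits above can be checked before you upload a sample. The following is a minimal sketch for a JSON sample file, not part of the product. It assumes that "1 MB" means 1,048,576 bytes and that a "simple field" corresponds to a distinct path to a leaf value, with repeated array elements counted once. Intelligent Structure Discovery may count fields differently. The names `simple_field_paths` and `check_sample` are hypothetical.

```python
import json
import os

MAX_BYTES = 1024 * 1024      # assumption: "1 MB" read as 1,048,576 bytes
MAX_SIMPLE_FIELDS = 30_000   # limit stated in the documentation


def simple_field_paths(node, prefix=""):
    """Return the set of distinct paths that lead to leaf values.

    Assumption: every array element shares one path segment ("[]"),
    so repeated rows do not inflate the count.
    """
    if isinstance(node, dict):
        paths = set()
        for key, value in node.items():
            paths |= simple_field_paths(value, f"{prefix}/{key}")
        return paths
    if isinstance(node, list):
        paths = set()
        for item in node:
            paths |= simple_field_paths(item, f"{prefix}[]")
        return paths
    return {prefix}  # string, number, boolean, or null


def check_sample(path):
    """Report whether a JSON sample file stays within both limits."""
    size = os.path.getsize(path)
    with open(path, encoding="utf-8") as handle:
        fields = len(simple_field_paths(json.load(handle)))
    return {
        "size_bytes": size,
        "simple_fields": fields,
        "size_ok": size <= MAX_BYTES,
        "fields_ok": fields <= MAX_SIMPLE_FIELDS,
    }
```
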
To achieve optimal parsing results, ensure that the input that you provide when you create the model is broad enough and covers all the data elements that you expect the model to receive at run time. If the input is too limited, the parsing output will include unidentified data.
Use simplified input to generate the model. For example, if the input data has tables, provide a table with just a few sample rows rather than many rows of data. If you use a JSON input file that contains repeating groups of data, limit the number of repetitions.
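
One way to limit repetitions in a JSON sample is to truncate every array to a few elements before you create the model. The following is a minimal sketch under that assumption. The function name `trim_repetitions` and the default of three elements are illustrative choices, not product settings.

```python
import json


def trim_repetitions(node, keep=3):
    """Return a copy of parsed JSON with every array cut to `keep` items."""
    if isinstance(node, dict):
        return {key: trim_repetitions(value, keep) for key, value in node.items()}
    if isinstance(node, list):
        return [trim_repetitions(item, keep) for item in node[:keep]]
    return node


if __name__ == "__main__":
    # Hypothetical sample with 500 repeated order rows.
    full = {"orders": [{"id": i, "items": list(range(50))} for i in range(500)]}
    print(json.dumps(trim_repetitions(full, keep=2), indent=2))
```

Because the sketch keeps only the first elements of each array, review the trimmed sample to confirm that it still contains every field that you expect at run time. Fields that appear only in later repetitions are dropped.
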
If the model does not match the runtime input data, or only partially matches the input data, there might be a large amount of unidentified data and data loss. However, some variations will still be parsed.
Verify that the length of each combination of group name and field name doesn't exceed 80 characters. For example, if the name of the group that a field belongs to is "group", the field name must not exceed 75 characters. If a combination of group name and field name exceeds 80 characters, mappings that use the model fail to run.
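
The length rule above can be expressed as a simple check. The following is a minimal sketch that assumes the combined length is the sum of the two name lengths with no separator, which matches the example of a 5-character group name and a 75-character field name. The input mapping and the function name `overlong_names` are hypothetical.

```python
MAX_COMBINED_LENGTH = 80  # limit stated in the documentation


def overlong_names(groups):
    """Find group and field name pairs whose combined length exceeds the limit.

    `groups` is a hypothetical mapping of group name to a list of field names.
    Returns (group, field, combined_length) tuples for each violation.
    """
    return [
        (group, field, len(group) + len(field))
        for group, fields in groups.items()
        for field in fields
        if len(group) + len(field) > MAX_COMBINED_LENGTH
    ]


if __name__ == "__main__":
    model = {"group": ["customer_id", "x" * 76]}
    for group, field, length in overlong_names(model):
        print(f"{group}.{field[:20]}... is {length} characters long")
```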

Using ORC files

You can use the model to read ORC files through a flat file connection in Data Integration. You can't use the model for ORC streaming.

Using multiple sample files in a model

After you create a model based on a JSON, XML, ORC, Avro, or Parquet sample file, you can use additional sample files to enrich the structure with fields that exist in the new samples. The additional files must be of the same file type as the file that the model is based on.

Using multi-file XSD schemas

Consider the following guidelines when you use an XSD schema that contains multiple XSD files as the model input:
  • The schema files must be compressed.
  • If the XSD files reside in a directory structure, to preserve the structure, the parent directory must be compressed.
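
The guidelines above can be sketched as a short script that compresses the parent directory so that the relative paths between the XSD files are preserved. The guidelines say only that the files must be compressed, so the ZIP format used here is an assumption. The function name `compress_schema_dir` and the example paths are hypothetical.

```python
import shutil
from pathlib import Path


def compress_schema_dir(schema_dir, archive_base):
    """Compress `schema_dir` so that the archive keeps the directory itself.

    Entries are stored as "<directory name>/...", which preserves the
    relative locations that the XSD files use to reference each other.
    Returns the path of the archive that was created.
    """
    schema_dir = Path(schema_dir).resolve()
    return shutil.make_archive(
        str(Path(archive_base).resolve()),  # archive path without the extension
        "zip",                              # assumption: ZIP is an accepted format
        root_dir=schema_dir.parent,         # archive paths are relative to the parent
        base_dir=schema_dir.name,           # include the directory as the top-level entry
    )


if __name__ == "__main__":
    # Hypothetical layout: schemas/main.xsd and schemas/common/types.xsd
    print(compress_schema_dir("schemas", "schemas_bundle"))
```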

Using XML sample files in XSD-based models

When you create an XSD-based model to use in a Structure Parser transformation, you can attach an XML sample file to the model. The names and contents of the groups in the model appear on the Intelligent Structure Model page. When you associate the model with the Structure Parser transformation, use this information to decide which group to connect to the target. Attaching a sample file to the model doesn't affect or change the structure of the model.

Parsing JSON-encoded Avro messages

You can use models that are based on an Avro schema to parse JSON-encoded Avro messages.