Troubleshooting intelligent structure models

Consider the following troubleshooting tips when you create intelligent structure models.
Using differently structured files causes data loss.
If the intelligent structure model does not match the input file or only partially matches the input file, there might be data loss.
For example, you created a model for a sample file that contains rows with six fields of data: computer ID, computer IP address, access URL, username, password, and access timestamp. However, some of the input files contain rows with nine fields of data: computer ID, computer name, computer IP address, country of origin, access URL, username, password, access code, and access timestamp. In that case, data might be misidentified, and some data might be designated as unidentified data.
If some input files contain more types of data than other input files, or different types of data, for best results create a sample file that contains all the different types of data.
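One way to spot differently structured input before you build the sample file is to profile how many fields each row contains. The following Python sketch is only an illustration and is not part of Intelligent Structure Discovery. It assumes comma-delimited input, and the function name `field_count_profile` and the sample rows are hypothetical.

```python
import csv
import io
from collections import Counter

def field_count_profile(lines, delimiter=","):
    """Count how many rows have each number of fields (hypothetical helper)."""
    reader = csv.reader(lines, delimiter=delimiter)
    return Counter(len(row) for row in reader if row)

# Hypothetical sample file: one row with the six fields from the example.
sample = io.StringIO(
    "pc1,10.0.0.1,http://a.example,user1,secret,2024-01-01T00:00:00\n"
)

# Hypothetical input: the second row mirrors the longer field list from the example.
incoming = io.StringIO(
    "pc1,10.0.0.1,http://a.example,user1,secret,2024-01-01T00:00:00\n"
    "pc2,host2,10.0.0.2,US,http://b.example,user2,secret,A1,2024-01-02T00:00:00\n"
)

sample_counts = field_count_profile(sample)
incoming_counts = field_count_profile(incoming)

# Row widths that appear in the input but not in the sample file.
unexpected = sorted(set(incoming_counts) - set(sample_counts))
print("Row widths missing from the sample file:", unexpected)
```

Any width reported here suggests a row layout that the sample file does not yet cover and that you might want to add to it.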
Data in a Microsoft Word or Microsoft Excel file wasn't parsed.
When Intelligent Structure Discovery creates a model that is based on a Microsoft Word or Microsoft Excel file, it might identify unstructured data, such as free text, as an unparsed node and exclude the node from the model structure and from the output. You can edit the model to include excluded nodes in the structure. For more information, see Editing the nodes.
Data in PDF forms wasn't modeled or parsed.
An intelligent structure model parses data within PDF form fields. Ensure that the PDF form includes fields.
Error: Unsupported field names might cause data loss.
Do not use duplicate names for different elements.
If you use Big Data Management 10.2.1, ensure that the names of output groups follow Informatica Developer naming conventions. An element name must contain only English letters (A-Z, a-z), numerals (0-9), and underscores. Do not use reserved logical terms, and do not start element names with a number.
In later versions of Big Data Management or Data Engineering Integration, Intelligent Structure Discovery replaces special characters in element names with underscores and inserts underscores before element names that start with numerals and before element names that are reserved logical terms.
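The renaming behavior described above can be approximated as follows. This is a minimal Python sketch of the stated rules, not Informatica's implementation. The function name `sanitize_element_name` is hypothetical, the `RESERVED_TERMS` set is a placeholder because the reserved logical terms are not listed here, and the case-insensitive comparison is an assumption.

```python
import re

# Placeholder only: the actual reserved logical terms are not listed in this topic.
RESERVED_TERMS = {"and", "or", "not"}

def sanitize_element_name(name, reserved=RESERVED_TERMS):
    """Approximate the renaming rules described above (hypothetical helper)."""
    # Replace every character other than A-Z, a-z, 0-9, and underscore.
    cleaned = re.sub(r"[^A-Za-z0-9_]", "_", name)
    # Insert an underscore before a leading numeral or a reserved term.
    # Assumption: reserved terms are compared case-insensitively.
    if cleaned[:1].isdigit() or cleaned.lower() in reserved:
        cleaned = "_" + cleaned
    return cleaned

for original in ["access-code", "computer ID", "2nd_ip", "and"]:
    print(original, "->", sanitize_element_name(original))
```

Running a check like this over planned element names can help you see which names would be changed, or which names you need to fix manually in Big Data Management 10.2.1.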
Intelligent Structure Discovery assigns long records to an Unassigned Data field.
Intelligent Structure Discovery assigns records that exceed the maximum record size to an Unassigned Data field. The default maximum record size is 640,000 bytes.
You can increase the maximum record size up to 10 MB by configuring a Data Integration Server service DTM JVM property in Administrator.
Use the following syntax to define the maximum record size:
-DISD_MAX_RECORD_SIZE=<maximum record size in bytes>
For example, to define a maximum record size of 2 MB, enter the following value for the JVMOption1 property:
-DISD_MAX_RECORD_SIZE=2000000
For more information about configuring Data Integration Server service properties, see the Administrator help.
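To estimate whether you need to raise the limit, you can measure the longest record in a representative input and derive the option value from it. The following Python sketch is an illustration under stated assumptions: records are available as byte strings, 10 MB is read as 10,000,000 bytes (consistent with the 2 MB example above), a record that is exactly the maximum size is still accepted, and the function name `jvm_option_for` is hypothetical.

```python
DEFAULT_MAX_RECORD_SIZE = 640_000   # default maximum record size in bytes, as stated above
UPPER_LIMIT = 10_000_000            # assumption: 10 MB interpreted as 10,000,000 bytes

def jvm_option_for(records, default=DEFAULT_MAX_RECORD_SIZE, upper=UPPER_LIMIT):
    """Return a -DISD_MAX_RECORD_SIZE option for the longest record,
    or None if the default limit is already sufficient (hypothetical helper)."""
    longest = max((len(record) for record in records), default=0)
    if longest <= default:
        return None
    if longest > upper:
        raise ValueError(f"Longest record ({longest} bytes) exceeds the {upper}-byte limit")
    return f"-DISD_MAX_RECORD_SIZE={longest}"

# Hypothetical records: one short record and one that exceeds the default limit.
records = [b"a" * 100, b"b" * 700_000]
print(jvm_option_for(records))
```

In practice you might round the value up to leave headroom for longer records in future input files.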
When you try to base a model on a sample ORC file that contains Union data, the model creation fails.
Intelligent Structure Discovery doesn't process the Union data type in ORC input. Select a file that doesn't contain Union data to base the model on.
