Troubleshooting intelligent structure models

Consider the following troubleshooting tips when you create intelligent structure models.
Using differently structured files causes data loss.
If the intelligent structure model does not match the input file or only partially matches the input file, there might be data loss.
For example, you created a model for a sample file that contains rows with six fields of data: computer ID, computer IP address, access URL, username, password, and access timestamp. However, some of the input files contain rows with nine fields of data: computer ID, computer name, computer IP address, country of origin, access URL, username, password, access code, and access timestamp. In this case, the data might be misidentified, and some data might be designated as unidentified data.
If some input files contain more types of data than other input files, or different types of data, for best results create a sample file that contains all the different types of data.
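As a pre-check before you run input files against a model, the following minimal sketch flags rows whose field count never occurs in the sample file. It is not part of Intelligent Structure Discovery. It assumes comma-delimited text input, and the function names `field_counts` and `rows_not_matching` are hypothetical helpers introduced for illustration.

```python
import csv
import io


def field_counts(text, delimiter=","):
    """Return the set of field counts that occur in non-empty rows of the text."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return {len(row) for row in reader if row}


def rows_not_matching(sample_text, input_text, delimiter=","):
    """Return 1-based row numbers of input rows whose field count
    does not occur anywhere in the sample (assumes delimited text)."""
    known = field_counts(sample_text, delimiter)
    reader = csv.reader(io.StringIO(input_text), delimiter=delimiter)
    return [
        number
        for number, row in enumerate(reader, start=1)
        if row and len(row) not in known
    ]


# Sample rows have six fields; the second input row has nine.
sample = "c01,10.0.0.1,http://example.com,alice,secret,2021-01-22T10:00:00\n"
incoming = sample + (
    "c02,host2,10.0.0.2,US,http://example.com,bob,secret,A1,2021-01-22T10:05:00\n"
)
print(rows_not_matching(sample, incoming))
```

Rows that this check reports are candidates to add to the sample file, so that the sample covers every structure that appears in the input.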
Data from PDF forms was not modeled or parsed.
An intelligent structure model can model and parse the data within PDF form fields but not data outside the fields. A field title, or other data outside the field, will not be identified.
Data from Microsoft Word was not modeled or parsed.
An intelligent structure model can model and parse data within Microsoft Word tables. All other data is collected as unparsed data.
Error: Unsupported field names might cause data loss.
Do not use duplicate names for different elements.
If you use Big Data Management 10.2.1, ensure that the names of output groups follow Informatica Developer naming conventions. An element name must contain only English letters (A-Z, a-z), numerals (0-9), and underscores. Do not use reserved logical terms, and do not start element names with a number.
In later versions of Big Data Management or Data Engineering Integration, Intelligent Structure Discovery replaces special characters in element names with underscores and inserts underscores before element names that start with numerals and before element names that are reserved logical terms.
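The naming rules above can be sketched as a validator and a renaming helper. This is an illustration of the rules as stated here, not the product's implementation. The `RESERVED_TERMS` set is a hypothetical placeholder, because this topic does not list the reserved logical terms, and the function names are assumptions.

```python
import re

# Hypothetical placeholder: the actual reserved logical terms are not listed here.
RESERVED_TERMS = {"AND", "OR", "NOT"}

# English letters, numerals, and underscores only; must not start with a numeral.
VALID_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_element_name(name):
    """Check a name against the 10.2.1 rules described above."""
    return bool(VALID_NAME.fullmatch(name)) and name.upper() not in RESERVED_TERMS


def sanitize_element_name(name):
    """Mimic the later-version behavior: replace special characters with
    underscores, then prefix an underscore if the name starts with a
    numeral or is a reserved term."""
    cleaned = re.sub(r"[^A-Za-z0-9_]", "_", name)
    if cleaned[:1].isdigit() or cleaned.upper() in RESERVED_TERMS:
        cleaned = "_" + cleaned
    return cleaned


def duplicate_names(names):
    """Return the names that occur more than once, in first-seen order."""
    seen, duplicates = set(), []
    for name in names:
        if name in seen and name not in duplicates:
            duplicates.append(name)
        seen.add(name)
    return duplicates


for candidate in ["access_url", "access-url", "2nd_user"]:
    print(candidate, is_valid_element_name(candidate), sanitize_element_name(candidate))
```

Running the validator over element names before you export a model for Big Data Management 10.2.1 can surface names that need to be renamed by hand.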
Intelligent Structure Discovery assigns long records to an unassigned port.
Intelligent Structure Discovery assigns records that exceed the maximum record size to an unassigned port. The default maximum record size is 640,000 bytes.
You can increase the maximum record size by configuring one of the DTM JVM properties of the Data Integration Server service in Administrator.
Use the following syntax to define the maximum record size:
-DISD_MAX_RECORD_SIZE=<maximum record size in bytes>
For example, to define a maximum record size of 2 MB, enter the following value for the JVMOption1 property:
-DISD_MAX_RECORD_SIZE=2000000
As a guideline, do not set the maximum record size above 10 MB.
For more information about configuring Data Integration Server service properties, see the Administrator help.
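To choose a value for the property, it can help to see how large the records in an input file are. The following minimal sketch formats the option string and lists records that exceed a given limit. It assumes newline-delimited records measured in raw bytes, which might differ from how Intelligent Structure Discovery delimits records, and it reads 10 MB as 10,000,000 bytes to match the 2 MB example above. The function names are hypothetical.

```python
DEFAULT_MAX_RECORD_SIZE = 640_000   # default stated above, in bytes
RECOMMENDED_CEILING = 10_000_000    # assumption: 10 MB in decimal megabytes


def jvm_option(max_record_bytes):
    """Format the JVM option value for a maximum record size in bytes."""
    if max_record_bytes <= 0:
        raise ValueError("maximum record size must be a positive number of bytes")
    return f"-DISD_MAX_RECORD_SIZE={max_record_bytes}"


def within_recommended_ceiling(max_record_bytes):
    """Return True if the size does not exceed the recommended 10 MB."""
    return max_record_bytes <= RECOMMENDED_CEILING


def oversized_records(data, limit=DEFAULT_MAX_RECORD_SIZE, separator=b"\n"):
    """Return (record number, size in bytes) for records larger than the limit.
    Assumes records are separated by the given byte sequence."""
    return [
        (number, len(record))
        for number, record in enumerate(data.split(separator), start=1)
        if len(record) > limit
    ]


print(jvm_option(2_000_000))
print(oversized_records(b"short\n" + b"x" * 700_000 + b"\n"))
```

If the largest reported record fits under the recommended ceiling, a limit slightly above that size is a reasonable starting point.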
When you try to base a model on a sample ORC file that contains Union data, the model creation fails.
Intelligent Structure Discovery doesn't process the Union data type in ORC input. Select a file that doesn't contain Union data to base the model on.


Updated January 22, 2021