Consider the following troubleshooting tips when you create intelligent structure models.
Using differently structured files causes data loss.
If the intelligent structure model does not match the input file, or only partially matches it, there might be data loss.
For example, suppose you created a model from a sample file whose rows contain six fields of data: computer ID, computer IP address, access URL, username, password, and access timestamp. However, some of the input files contain rows with nine fields of data: computer ID, computer name, computer IP address, country of origin, access URL, username, password, access code, and access timestamp. In that case, the data might be misidentified, and some data might be designated as unidentified data.
If some input files contain more types of data than other input files, or different types of data, for best results create a sample file that contains all the different types of data.
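One way to assemble such a sample file is to collect one representative row for each row layout found across the input files. The following Python sketch is only an illustration, not part of Intelligent Structure Discovery: it assumes comma-delimited text input and uses the number of fields in a row as a rough stand-in for "type of data". The function name `representative_rows` is hypothetical.

```python
import csv


def representative_rows(lines, delimiter=","):
    """Return the first row seen for each distinct field count.

    Assumption: rows with the same number of fields share the same layout.
    Rows are returned in the order their layout first appears.
    """
    first_seen = {}
    for row in csv.reader(lines, delimiter=delimiter):
        if not row:
            continue  # skip blank lines
        # Keep only the first row encountered for each field count.
        first_seen.setdefault(len(row), row)
    return list(first_seen.values())


# Illustrative data only: one six-field row and one nine-field row.
input_lines = [
    "PC01,10.0.0.5,http://example.com,alice,secret,2020-01-01T10:00:00",
    "PC02,10.0.0.6,http://example.com,bob,hunter2,2020-01-01T10:05:00",
    "PC03,lab-pc,10.0.0.7,US,http://example.com,carol,pass,A1,2020-01-01T10:10:00",
]
sample_rows = representative_rows(input_lines)
```

Writing `sample_rows` to a file gives a sample that contains both the six-field and the nine-field layout, which you can then use as the basis for the model.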
Data in a Microsoft Word or Microsoft Excel file wasn't parsed.
When Intelligent Structure Discovery creates a model that is based on a Microsoft Word or Microsoft Excel file, it might identify unstructured data, such as free text, as an unparsed node and exclude the node from the model structure and from the output. You can edit the model to include excluded nodes in the structure. For more information, see Editing the nodes.
Data in PDF forms wasn't modeled or parsed.
An intelligent structure model parses data within PDF form fields. Ensure that the PDF form includes fields.
Error: Unsupported field names might cause data loss.
Do not use duplicate names for different elements.
If you use Big Data Management 10.2.1, ensure that the names of output groups follow Informatica Developer naming conventions. An element name must contain only English letters (A-Z, a-z), numerals (0-9), and underscores. Do not use reserved logical terms, and do not start element names with a number.
In later versions of Big Data Management or Data Engineering Integration, Intelligent Structure Discovery replaces special characters in element names with underscores and inserts underscores before element names that start with numerals and before element names that are reserved logical terms.
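The naming rules above can be approximated in a short Python sketch, for example to check element names before you create a model. This is an illustration of the rules as described here, not Informatica's implementation. The `RESERVED_TERMS` set is a hypothetical placeholder because the reserved logical terms are not listed in this section, and the case-insensitive comparison against it is also an assumption.

```python
import re

# Hypothetical placeholder: substitute the reserved logical terms that
# apply to your environment.
RESERVED_TERMS = {"and", "or", "not"}

# Letters, numerals, and underscores only; must not start with a numeral.
VALID_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_element_name(name, reserved=RESERVED_TERMS):
    """Check a name against the 10.2.1-style naming rules described above."""
    return bool(VALID_NAME.fullmatch(name)) and name.lower() not in reserved


def sanitize_element_name(name, reserved=RESERVED_TERMS):
    """Approximate the renaming described for later versions."""
    # Replace every character that is not a letter, numeral, or underscore.
    cleaned = re.sub(r"[^A-Za-z0-9_]", "_", name)
    # Prefix an underscore if the name starts with a numeral or is reserved.
    if cleaned[:1].isdigit() or cleaned.lower() in reserved:
        cleaned = "_" + cleaned
    return cleaned
```

For example, under these assumptions `sanitize_element_name("2nd ip")` yields `_2nd_ip`, and `is_valid_element_name("user-name")` is false because of the hyphen.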
When you try to base a model on a sample ORC file that contains Union data, the model creation fails.
Intelligent Structure Discovery doesn't process the Union data type in ORC input. Select a file that doesn't contain Union data to base the model on.
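To screen candidate sample files, you can check the file's schema for a union type before you create the model. The Python sketch below assumes you have already obtained the ORC schema as a type-description string by some other means, such as a tool that prints the file's metadata. It relies on ORC's type-string notation, in which a union is written as `uniontype<...>`. The function name `schema_has_union` is hypothetical.

```python
import re

# In an ORC type-description string, a union appears as "uniontype<...>".
UNION_PATTERN = re.compile(r"\buniontype\s*<", re.IGNORECASE)


def schema_has_union(schema_text):
    """Return True if the ORC schema string declares a union type anywhere."""
    return bool(UNION_PATTERN.search(schema_text))
```

A schema such as `struct<id:int,val:uniontype<int,string>>` is flagged, while `struct<id:int,name:string>` is not.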