Table of Contents


  1. Preface
  2. Introduction to Informatica Big Data Management
  3. Mappings in the Hadoop Environment
  4. Mapping Sources in the Hadoop Environment
  5. Mapping Targets in the Hadoop Environment
  6. Mapping Transformations in the Hadoop Environment
  7. Processing Hierarchical Data on the Spark Engine
  8. Configuring Transformations to Process Hierarchical Data
  9. Processing Unstructured and Semi-structured Data with an Intelligent Structure Model
  10. Stateful Computing on the Spark Engine
  11. Monitoring Mappings in the Hadoop Environment
  12. Mappings in the Native Environment
  13. Profiles
  14. Native Environment Optimization
  15. Cluster Workflows
  16. Connections
  17. Data Type Reference
  18. Function Reference
  19. Parameter Reference

Intelligent Structure Discovery Process

You can create a CLAIRE™ intelligent structure model in Intelligent Structure Discovery, a service in Data Integration. When you provide a sample file, Intelligent Structure Discovery determines the underlying structure of the information and creates a model of that structure.

After you create an intelligent structure model, you can view, edit, and refine it. For example, you can choose to exclude or combine structure elements, or normalize repeating groups. When you finish refining the model, you can export it and then associate it with a data object in a Big Data Management mapping.
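To illustrate what normalizing a repeating group means, the following sketch flattens a record that contains a repeating group into one row per group element. This is plain Python with invented field names, not the Intelligent Structure Discovery API; it only shows the general idea of the transformation.

```python
# Illustrative only: a record with a repeating "orders" group,
# similar to semi-structured JSON input. Field names are invented.
record = {
    "customer": "Acme",
    "orders": [
        {"id": 1, "amount": 250},
        {"id": 2, "amount": 75},
    ],
}

def normalize(rec, group_key):
    """Emit one flat row per element of the repeating group,
    copying the non-repeating fields onto every row."""
    common = {k: v for k, v in rec.items() if k != group_key}
    return [{**common, **item} for item in rec[group_key]]

rows = normalize(record, "orders")
# Each order becomes its own flat row that also carries "customer".
```

After normalization, each row is self-contained, which is the shape that relational targets in a mapping typically expect.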
The following image shows the process by which Intelligent Structure Discovery deciphers the underlying patterns of data and creates a model of the data patterns.
You can create models for semi-structured data from Microsoft Excel, Microsoft Word tables, PDF forms, and CSV files, or unstructured text files. You can also create models for structured data such as XML and JSON files.
You can quickly model data for files whose structure is difficult, time-consuming, and costly to determine manually, such as log files, clickstreams, customer web access, error text files, or other internet, sensor, or device data that does not follow industry standards.


Updated October 23, 2019