Processing Unstructured and Semi-structured Data with Intelligent Structure Model Overview

You can use CLAIRE™ Intelligent Structure Discovery to parse semi-structured or structured data in mappings that run on the Spark engine.
Long, complex files with little or no structure can be difficult to understand, much less parse. CLAIRE™ Intelligent Structure Discovery can automatically discover the structure in unstructured data.
CLAIRE™ uses machine learning algorithms to decipher data in semi-structured or unstructured data files and create a model of the underlying structure of the data. You can generate an intelligent structure model, a model of the pattern, repetitions, relationships, and types of fields of data discovered in a file, in Informatica Intelligent Cloud Services.
To use the model, you export it from Data Integration and then associate it with a data object in a Big Data Management mapping. You can run the mapping on the Spark engine to process the data. The mapping uses the intelligent structure model to extract and parse data from input files based on the structure expressed in the model.
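
The idea of discovering a structure from sample data and then using it to parse new input can be illustrated with a small conceptual sketch. This is not the CLAIRE™ algorithm or any Informatica API. It is a toy example that assumes whitespace-delimited lines, and every name in it (`infer_type`, `discover_model`, `parse_line`) is hypothetical.

```python
import re
from collections import Counter

# Hypothetical converters for the toy field types used in this sketch.
CONVERTERS = {"integer": int, "decimal": float, "date": str, "string": str}


def infer_type(token):
    """Guess a simple type for one token (toy rules, assumed for illustration)."""
    if re.fullmatch(r"\d{4}-\d{2}-\d{2}", token):
        return "date"
    if re.fullmatch(r"-?\d+", token):
        return "integer"
    if re.fullmatch(r"-?\d+\.\d+", token):
        return "decimal"
    return "string"


def discover_model(sample_lines):
    """Build a toy 'structure model': one named, typed field per token position."""
    columns = {}
    for line in sample_lines:
        for position, token in enumerate(line.split()):
            columns.setdefault(position, Counter())[infer_type(token)] += 1
    # Use the most frequently observed type at each position.
    return [
        {"name": f"field_{position + 1}", "type": counts.most_common(1)[0][0]}
        for position, counts in sorted(columns.items())
    ]


def parse_line(model, line):
    """Parse one input line into a record based on the discovered model."""
    record = {}
    for field, token in zip(model, line.split()):
        try:
            record[field["name"]] = CONVERTERS[field["type"]](token)
        except ValueError:
            # Keep the raw text when a token does not fit the modeled type.
            record[field["name"]] = token
    return record


if __name__ == "__main__":
    sample = ["2019-10-23 ERROR 42 0.75", "2019-10-24 INFO 7 1.50"]
    model = discover_model(sample)
    print(model)
    print(parse_line(model, "2019-10-25 WARN 13 2.25"))
```

In this sketch the model is derived once from a sample and then reused for each input line, which mirrors the separation described above between generating the model and running the mapping that uses it.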


Updated October 23, 2019