Table of Contents

  1. Preface
  2. Introduction to Informatica Big Data Management
  3. Mappings in the Hadoop Environment
  4. Mapping Sources in the Hadoop Environment
  5. Mapping Targets in the Hadoop Environment
  6. Mapping Transformations in the Hadoop Environment
  7. Processing Hierarchical Data on the Spark Engine
  8. Configuring Transformations to Process Hierarchical Data
  9. Processing Unstructured and Semi-structured Data with an Intelligent Structure Model
  10. Stateful Computing on the Spark Engine
  11. Monitoring Mappings in the Hadoop Environment
  12. Mappings in the Native Environment
  13. Profiles
  14. Native Environment Optimization
  15. Cluster Workflows
  16. Connections
  17. Data Type Reference
  18. Function Reference
  19. Parameter Reference

Big Data Management User Guide

How to Develop a Mapping to Process Data with an Intelligent Structure Model

You can create a mapping with a data object that incorporates an Intelligent structure model to parse data. You run the mapping on the Spark engine to process the data.

The tasks that you perform to develop the mapping, and the order in which you perform them, depend on the mapping scenario. The following list outlines the high-level tasks to develop and run a mapping that reads and processes data in files of any type that an Intelligent structure model can process, and then writes the data to a target.
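The idea behind model-based parsing can be pictured with a plain-Python sketch. This is only an analogy, not Informatica's implementation: the hard-coded pattern below stands in for a model that Intelligent Structure Discovery would derive from a representative file, and the parser projects matching fields of each semi-structured record into named columns.

```python
import re

# Hypothetical "model": a pattern describing one semi-structured record.
# An Intelligent structure model is discovered from a sample file; this
# equivalent pattern is hard-coded here for illustration only.
MODEL = re.compile(r"(?P<date>\d{4}-\d{2}-\d{2}) (?P<level>\w+) (?P<message>.+)")

def parse(lines):
    """Project each matching line into named columns, the way a parser
    built from a structure model would."""
    rows = []
    for line in lines:
        m = MODEL.match(line)
        if m:
            rows.append(m.groupdict())
    return rows

sample = [
    "2018-06-01 INFO mapping started",
    "2018-06-01 WARN source file truncated",
]
rows = parse(sample)
```

In the product, this discovery and projection happens inside the data object at run time on the Spark engine; no hand-written pattern is involved.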
In Data Integration, create an Intelligent structure model.
Create an Intelligent structure model using a representative file. Export the .amodel file, and save it locally or copy it to the relevant file storage system. For more information, see Informatica Intelligent Cloud Services Mappings at the following link: https://network.informatica.com/onlinehelp/IICS/prod/CDI/en/index.htm#page/hh-cloud-mappings/Mappings.html.
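If you copy the exported model to cluster or cloud storage, standard file system tools apply. A minimal sketch, assuming a hypothetical model file named customer_logs.amodel and example destination paths:

```shell
# Hypothetical file and destination names; adjust to your environment.
# Copy the exported model into HDFS:
hdfs dfs -mkdir -p /models
hdfs dfs -put customer_logs.amodel /models/

# Or copy it to an Amazon S3 bucket:
aws s3 cp customer_logs.amodel s3://example-bdm-models/models/
```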
In Big Data Management, create a connection.
Create a connection to access data in files that are stored in the relevant system. You can create the following types of connections, which work with data objects that can incorporate an intelligent structure:
  • Hadoop Distributed File System
  • Amazon S3
  • Microsoft Azure Blob
Create a data object with an Intelligent structure model to read and parse source data.
  1. Create a data object with an Intelligent structure model to represent the source files. You can create the following types of data objects with an Intelligent structure model:
    • complex file
    • Amazon S3
    • Microsoft Azure Blob
  2. Configure the data object properties.
  3. In the read data object operation, enable the column file properties to project columns in the files as complex data types.
Create a data object to write data to a target.
  1. Create a data object to write the data to target storage.
  2. Configure the data object properties.
Create a mapping and add mapping objects.
  1. Create a mapping.
  2. Add a Read transformation based on the data object with the Intelligent structure model.
  3. Based on the mapping logic, add other transformations that are supported on the Spark engine. Link the ports and configure the transformation properties based on the mapping logic.
  4. Add a Write transformation based on the data object that passes the data to the target storage or output. Link the ports and configure the transformation properties based on the mapping logic.
Configure the mapping to run on the Spark engine.
Configure the following mapping run-time properties:
  1. Select Hadoop as the validation environment and Spark as the engine.
  2. Select Hadoop as the execution environment and select a Hadoop connection.
Validate and run the mapping on the Spark engine.
  1. Validate the mapping and fix any errors.
  2. Optionally, view the Spark engine execution plan to debug the logic.
  3. Run the mapping.
