Table of Contents

  1. Preface
  2. Introduction to Informatica Big Data Management
  3. Mappings in the Hadoop Environment
  4. Mapping Sources in the Hadoop Environment
  5. Mapping Targets in the Hadoop Environment
  6. Mapping Transformations in the Hadoop Environment
  7. Processing Hierarchical Data on the Spark Engine
  8. Configuring Transformations to Process Hierarchical Data
  9. Processing Unstructured and Semi-structured Data with an Intelligent Structure Model
  10. Stateful Computing on the Spark Engine
  11. Monitoring Mappings in the Hadoop Environment
  12. Mappings in the Native Environment
  13. Profiles
  14. Native Environment Optimization
  15. Cluster Workflows
  16. Connections
  17. Data Type Reference
  18. Function Reference
  19. Parameter Reference

Big Data Management User Guide

Use Case

You work in an operations group for an insurance company. Your team wants to process web logs to identify operations and security issues.
Your back-end system collects data on system access and security activities in various branches of the company. After the data is collected, it is stored in the corporate data center and hosted in Amazon S3 storage.
Your team wants to understand the types of operations issues that have caused the most errors and system downtime in the past few weeks. You also want to store the data afterwards for auditing purposes.
Before your data analysts can begin working with the data, you need to parse the data in the Amazon S3 input buckets and produce actionable data. But you cannot spend the time and resources required to sift through the data and create models for analysis. You might have to develop numerous mappings and parameter sets just to parse the data and ensure that actionable data is created from the web logs.
Instead of manually creating individual transformations, your team can use automatically generated intelligent structure models to determine the relevant data sets. You create an intelligent structure model in Intelligent Structure Discovery, an application in Data Integration that uses machine learning algorithms to decipher data in structured or unstructured data files and discover the underlying structure of the data.
Intelligent Structure Discovery creates an intelligent structure model that represents the input file data structure. You create a mapping with a data object that uses the intelligent structure model to output actionable data sets.
After the mapping fetches data from the Amazon S3 input buckets, the mapping processes the data with an intelligent structure model to prepare the data, and can then write the data to the Amazon S3 output buckets.
The following image shows the process to fetch the data from Amazon S3 input buckets, parse and prepare the data, and then write the data to Amazon S3 output buckets. Analysts can then use the data to handle security issues and improve operations management.
This image shows an S3 data object reading log data and passing it to a Big Data Management mapping. The mapping processes the data on the Spark engine and writes the data to the Amazon S3 output buckets.
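You build the mapping graphically in the Developer tool rather than in code, but the following PySpark sketch illustrates the equivalent data flow at a conceptual level: read raw logs from an S3 input bucket, parse them into structured fields (the role the intelligent structure model plays in the mapping), and write the actionable records to an S3 output bucket. The bucket names, log format, and regular expression below are hypothetical and serve only to illustrate the flow.

    # Conceptual PySpark sketch of the mapping flow. Bucket names and the
    # log format are hypothetical; S3 access is assumed to be configured
    # (for example, through the hadoop-aws connector).
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("weblog-parse").getOrCreate()

    # Read raw web logs from the Amazon S3 input bucket.
    raw = spark.read.text("s3a://ops-input-bucket/weblogs/")

    # Parse each log line into structured fields, analogous to applying
    # an intelligent structure model to semi-structured log data.
    pattern = r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)[^"]*" (\d{3})'
    parsed = raw.select(
        F.regexp_extract("value", pattern, 1).alias("client_ip"),
        F.regexp_extract("value", pattern, 2).alias("timestamp"),
        F.regexp_extract("value", pattern, 3).alias("method"),
        F.regexp_extract("value", pattern, 4).alias("resource"),
        F.regexp_extract("value", pattern, 5).cast("int").alias("status"),
    )

    # Keep the error responses the analysts care about and write the
    # actionable data set to the Amazon S3 output bucket for auditing.
    errors = parsed.filter(F.col("status") >= 500)
    errors.write.mode("overwrite").parquet("s3a://ops-output-bucket/weblog-errors/")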
