You work in an operations group for an insurance company. Your team wants to process web logs to identify operations and security issues.
Your back-end system collects data on system access and security activities in various branches of the company. The collected data is stored in the corporate data center and hosted in Amazon S3 storage.
Your team wants to understand the types of operations issues that have caused the most errors and system downtime in the past few weeks. You also want to store the data afterward for auditing purposes.
Before your data analysts can begin working with the data, you need to parse the data in Amazon S3 input buckets and produce actionable data. However, you cannot spare the time and resources required to sift through the data and build analysis models manually. Doing so might require you to develop numerous mappings and parameter sets to make sure that actionable data is created from the web logs.
Instead of manually creating individual transformations, your team can use automatically generated intelligent structure models to determine the relevant data sets. You create an intelligent structure model in Intelligent Structure Discovery, an application in Data Integration that uses machine learning algorithms to decipher data in structured or unstructured data files and discover the underlying structure of the data.
Intelligent Structure Discovery creates an intelligent structure model that represents the input file data structure. You create a mapping with a data object that uses the intelligent structure model to output actionable data sets.
After the mapping fetches data from Amazon S3 input buckets, the mapping processes the data with an intelligent structure model to prepare the data, and can then write the data to Amazon S3 output buckets.
The following image shows the process to fetch the data from Amazon S3 input buckets, parse and prepare the data, and then write the data to Amazon S3 output buckets. Analysts can then use the data to handle security issues and improve operations management.