Table of Contents

  1. Preface
  2. Introduction to Mass Ingestion
  3. Prepare
  4. Create
  5. Deploy
  6. Run
  7. Monitor
  8. infacmd mi Command Reference

Mass Ingestion Guide

Use Case

You work in an IT group for a commercial bank. Your team is taking on a new project to personalize the rewards program that you offer to customers who open checking and savings accounts at your bank.
You plan to collect and analyze data on your customers to understand the types of rewards that customers are interested in. For example, one customer might be interested in saving money on groceries while another customer might be interested in travel deals.
You collect data on customer demographics, lifestyle metrics, income, transaction history, spending habits, online presence, interests, opinions, and brand knowledge.
You collect the data through different media, including customer logs on file with the bank, point-of-sale systems at companies that the bank partners with, social media interactions, and customer weblogs.
The following image shows the types of data that you collect and the media that you use to collect each type of data: customer logs on file with the bank, point-of-sale systems, social media interactions, and customer weblogs.
After the data is collected, it is stored in the bank's corporate data center, which includes different relational databases.
The following image shows how the different types of data flow into the bank's corporate data center and the databases that the bank might use to store the data:
Before your data analysts can begin working with the data, you need to ingest it from the relational databases into Amazon S3 buckets. However, you cannot spend the time and resources required to ingest such large amounts of data manually. You would have to develop numerous mappings and parameter sets to make sure that the data is ingested properly. You would also have to make sure that you do not ingest sensitive customer information, such as credit card numbers, and then maintain the mappings whenever the relational schemas change.
Instead of manually creating and running mappings, you can use mass ingestion. You create a single mass ingestion specification that ingests all of the data at once. You specify only the source, the target, and the parameters that you want to configure. When you deploy the specification, all of the data is ingested in one operation.
The following image shows how mass ingestion bridges the gap between the data that the bank stores in its relational databases in the corporate data center and the Amazon S3 input buckets:
After the data is ingested into Amazon S3 input buckets, you use Big Data Management™ to prepare the data and write the data to Amazon S3 output buckets.
The following image shows where you can use Big Data Management™ to prepare the data in the Amazon S3 input buckets and write it to Amazon S3 output buckets in a data lake:
Mass ingestion saves you significant time and resources, and your data analysts have much more time to analyze the data and develop a new system for the bank's rewards program.
