You work in an IT group for a commercial bank. Your team is taking on a new project to personalize the rewards program that you offer to customers who open checking and savings accounts at your bank.
You plan to collect and analyze data on your customers to understand the types of rewards that customers are interested in. For example, one customer might be interested in saving money on groceries while another customer might be interested in travel deals.
You collect data on customer demographics, lifestyle metrics, income, transaction history, spending habits, online presence, interests, opinions, and brand knowledge.
You collect the data through different media, including customer logs on file with the bank, point-of-sale systems at companies that the bank partners with, social media interactions, and customer weblogs.
The following image shows the types of data that you collect and the media that you use to collect the different types of data:
After the data is collected, it is stored in the bank's corporate data center, which includes different relational databases.
The following image shows how the data is stored in the bank's corporate data center and the databases that the bank might use to store the data:
Before your data analysts can begin working with the data, you need to ingest the data from the relational databases into Amazon S3 buckets. Ingesting this much data manually would consume significant time and resources: you would have to develop numerous mappings and parameter sets to make sure that the data is ingested properly, make sure that you do not ingest sensitive customer information such as credit card numbers, and then maintain the mappings whenever the relational schemas change.
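To make the cost of the manual approach concrete, the following is a minimal sketch of what a single hand-built mapping involves: projecting away sensitive columns before each row reaches S3. The table layout, column names, and writer callback are all hypothetical, and the actual S3 upload (for example with boto3) is omitted; multiply this by every table and every schema change to see the maintenance burden.

```python
# Hypothetical per-table mapping: drop sensitive columns before ingestion.
# Column names are illustrative; every table needs its own version of this,
# updated whenever the relational schema changes.

SENSITIVE_COLUMNS = {"credit_card_number", "cvv"}

def mask_row(row: dict) -> dict:
    """Project away sensitive columns so they are never ingested."""
    return {col: val for col, val in row.items() if col not in SENSITIVE_COLUMNS}

def ingest_table(rows, write_object):
    """Apply the mapping to every row, then hand the result to a writer
    (e.g. an S3 put_object call -- omitted here)."""
    write_object([mask_row(r) for r in rows])
```

The filtering itself is trivial; the problem is that each of the bank's many tables needs its own mapping and parameter set, all kept in sync by hand.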
Instead of manually creating and running mappings, you can use mass ingestion. You create one mass ingestion specification that ingests all of the data at once. You specify only the source, the target, and the parameters that you want to configure. When you deploy the specification, all of the data is ingested at once.
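The shape of that specification can be sketched as follows. This is an illustration of the idea, not Informatica's actual schema: the connection names, table list, bucket name, and parameter keys below are all hypothetical. The point is that one source, one target, and one shared parameter set drive the ingestion of every table.

```python
# Hypothetical mass ingestion specification: one spec covers all tables,
# in contrast to one hand-written mapping per table. Field names are
# illustrative only.

spec = {
    "source": {
        "connection": "bank_oracle",
        "tables": ["customers", "transactions", "interests"],
    },
    "target": {"connection": "s3_input", "bucket": "bank-rewards-input"},
    "parameters": {"exclude_columns": ["credit_card_number"], "format": "avro"},
}

def deploy(spec, ingest_table):
    """Run the same parameterized ingestion for every source table."""
    for table in spec["source"]["tables"]:
        ingest_table(table, spec["target"], spec["parameters"])
```

Because the exclusion of sensitive columns is a shared parameter rather than logic repeated in each mapping, a schema change means updating one specification instead of many mappings.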
The following image shows how mass ingestion can bridge the gap between the data that the bank stores in its relational databases and the Amazon S3 buckets:
After the data is ingested into Amazon S3 input buckets, you use Big Data Management™ to prepare the data and write the data to Amazon S3 output buckets.
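The prepare step can be sketched as a simple read-transform-write pass over the ingested objects. This is a minimal illustration, assuming the ingested data arrives as rows of customer records; the field names are hypothetical, and the input/output callbacks stand in for the actual S3 reads and writes that Big Data Management™ would perform.

```python
# Minimal sketch of preparing ingested data before writing it to the
# output bucket: standardize text fields and normalize numeric ones.
# Field names are hypothetical.

def prepare(row: dict) -> dict:
    """Clean one customer record for analysis."""
    return {
        "customer_id": row["customer_id"],
        "interest": row.get("interest", "").strip().lower(),
        "monthly_spend": round(float(row.get("monthly_spend", 0)), 2),
    }

def run(read_input, write_output):
    """Read rows from the input side, prepare them, write them out."""
    write_output([prepare(r) for r in read_input()])
```

In the real pipeline the read and write sides would be the Amazon S3 input and output buckets; here they are plain callables so the transformation logic stands on its own.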
The following image shows where you can use Big Data Management™ to prepare and write the data to Amazon S3 output buckets:
Mass ingestion saves you significant time and resources, and your data analysts have much more time to analyze the data and develop a new system for the bank's rewards program.