Mass Ingestion Guide

Process

The mass ingestion process uses the components of the mass ingestion architecture to create, deploy, run, and monitor a mass ingestion specification.
The mass ingestion process includes the following tasks:
Create
You create a mass ingestion specification in the Mass Ingestion tool. The Mass Ingestion Service validates and stores the specification in a Model repository.
After you create the specification, you can migrate the specification between Model repositories.
Deploy
You deploy the mass ingestion specification to a Data Integration Service and specify a Hadoop connection. The Mass Ingestion Service processes and deploys the specification to the Data Integration Service.
You can also deploy the mass ingestion specification to an application archive file to save the information about the specification as an application. If you deploy the specification to an application archive file, you can import the application to the Model repository. You can then deploy the application to a Data Integration Service.
Run
You run the mass ingestion specification to ingest data to Hive or HDFS. The Mass Ingestion Service schedules the specification to run. The Data Integration Service connects to the Hadoop environment. In the Hadoop environment, the Blaze, Spark, and Hive engines ingest the data to the target.
Monitor
The Mass Ingestion Service generates ingestion job statistics. You can monitor the statistics in the Mass Ingestion tool.
You can also view the statistics in the Administrator tool, where you can monitor the application and mappings that perform the ingestion job.
The following diagram illustrates the detailed mass ingestion process when you create, deploy, run, and monitor a mass ingestion specification:
This diagram describes the mass ingestion process. The relevant components are listed in the columns: the Mass Ingestion tool, the Mass Ingestion Service, and the Data Integration Service. The steps in the process are listed in the rows: Create, Deploy, Run, and Monitor. The cells in each row describe how each component of mass ingestion fulfills its role in the mass ingestion specification creation, deployment, run, and monitoring steps.
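
The ordering of the four tasks can be sketched as a small model. This is an illustrative sketch only, not Informatica code or any product API: the names `Stage`, `HANDLED_BY`, and `Specification` are hypothetical, and the only rule it encodes is the one implied by the text, namely that each task depends on the previous task having happened at least once.

```python
from enum import Enum


class Stage(Enum):
    """The four tasks of the process, in the order the guide lists them."""
    CREATE = "create"
    DEPLOY = "deploy"
    RUN = "run"
    MONITOR = "monitor"


ORDER = [Stage.CREATE, Stage.DEPLOY, Stage.RUN, Stage.MONITOR]

# Which components the text names for each task (summary strings, not API names).
HANDLED_BY = {
    Stage.CREATE: "Mass Ingestion tool, Mass Ingestion Service, Model repository",
    Stage.DEPLOY: "Mass Ingestion Service, Data Integration Service",
    Stage.RUN: "Mass Ingestion Service, Data Integration Service, Hadoop engines",
    Stage.MONITOR: "Mass Ingestion Service, Mass Ingestion tool, Administrator tool",
}


class Specification:
    """Hypothetical stand-in for a mass ingestion specification."""

    def __init__(self, name):
        self.name = name
        self.done = set()  # stages that have happened at least once

    def perform(self, stage):
        """Record a stage, rejecting it if the preceding stage never happened."""
        position = ORDER.index(stage)
        if position > 0 and ORDER[position - 1] not in self.done:
            needed = ORDER[position - 1].value
            raise ValueError(f"cannot {stage.value} before {needed}")
        self.done.add(stage)
        return HANDLED_BY[stage]


if __name__ == "__main__":
    spec = Specification("example_spec")  # hypothetical specification name
    for stage in ORDER:
        print(stage.value, "->", spec.perform(stage))
```

The model allows a stage to be repeated, for example running a deployed specification more than once, but it rejects a run that has no prior deployment.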
