You can choose the Blaze or Spark option to run profiles in the Hadoop run-time environment.
After you choose the Blaze option, you can select a Hadoop connection. The Data Integration Service pushes the profile logic to the Blaze engine on the Hadoop cluster to run profiles.
When you run a profile in the Hadoop environment, the Developer tool submits the profile jobs to the Profiling Service Module. The Profiling Service Module then breaks down the profile jobs into a set of mappings. The Data Integration Service pushes the mappings to the Blaze engine through the Hadoop connection. The Blaze engine processes the mappings and the Data Integration Service writes the profile results to the profiling warehouse.
After you choose the Spark option, you can select a Hadoop connection. The Data Integration Service pushes the profile logic to the Spark engine on the Hadoop cluster to run profiles. The profile job flow is the same as for the Blaze engine: the Developer tool submits the profile jobs to the Profiling Service Module, which breaks them down into a set of mappings. The Data Integration Service pushes the mappings to the Spark engine through the Hadoop connection, and after the Spark engine processes the mappings, the Data Integration Service writes the profile results to the profiling warehouse.
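The flow above can be sketched in pseudocode. This is an illustrative model only, assuming hypothetical names throughout; it is not the Informatica API. It shows how the same pipeline applies to either engine: a profile job is broken into mappings, the mappings are pushed to the chosen engine, and the results land in the profiling warehouse.

```python
# Hypothetical sketch of the documented profile job flow. None of these
# names come from the product; they only model the steps described above.

def run_profile(profile_job: str, engine: str, warehouse: list) -> list:
    """Model the profile run for either the Blaze or the Spark engine."""
    # The Profiling Service Module breaks the profile job into a set of mappings.
    mappings = [f"{profile_job}:mapping-{i}" for i in range(3)]
    # The Data Integration Service pushes each mapping to the chosen engine
    # through the Hadoop connection, and the engine processes it.
    results = [f"{engine} processed {m}" for m in mappings]
    # The Data Integration Service writes the results to the profiling warehouse.
    warehouse.extend(results)
    return warehouse

profiling_warehouse = []
run_profile("customer_profile", "Blaze", profiling_warehouse)
run_profile("customer_profile", "Spark", profiling_warehouse)
```

In this model, only the `engine` argument changes between the Blaze and Spark runs, which mirrors the documentation: the submission, mapping breakdown, and warehouse write steps are identical for both engines.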