Hadoop Environment

You can run profiles and scorecards in the Hadoop environment on the Blaze or Spark engine.
To run a profile on the Blaze or Spark engine, select a Hadoop connection, which pushes the profile logic to that engine. The Analyst tool or Developer tool submits the profile job to the Profiling Service Module, which breaks the job down into a set of mappings. The Data Integration Service pushes the mappings to the Hadoop environment through the Hadoop connection. The Blaze or Spark engine processes the mappings, and the Data Integration Service writes the profile results to the profiling warehouse.
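The flow described above can be pictured with a small Python model. This is only an illustrative sketch and does not use any Informatica API. Every name in it (ProfileJob, ProfilingWarehouse, split_into_mappings, run_on_engine, run_profile) is hypothetical, and the choice of one mapping per column is an assumption made purely to keep the example short.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical model of the documented flow. None of these names
# correspond to real Informatica classes or functions.


@dataclass
class ProfileJob:
    name: str
    columns: List[str]
    engine: str  # "Blaze" or "Spark"


@dataclass
class ProfilingWarehouse:
    results: Dict[str, dict] = field(default_factory=dict)


def split_into_mappings(job: ProfileJob) -> List[str]:
    """Profiling Service Module step: break the profile job into mappings.

    One mapping per column is an assumption made for illustration only.
    """
    return [f"{job.name}:{column}" for column in job.columns]


def run_on_engine(engine: str, mapping: str) -> dict:
    """Blaze or Spark engine step: process one mapping (stubbed out)."""
    if engine not in ("Blaze", "Spark"):
        raise ValueError(f"unsupported engine: {engine}")
    return {"mapping": mapping, "engine": engine, "status": "processed"}


def run_profile(job: ProfileJob, hadoop_connection: str,
                warehouse: ProfilingWarehouse) -> List[str]:
    """Data Integration Service step: push each mapping through the
    Hadoop connection, then write the result to the profiling warehouse."""
    mappings = split_into_mappings(job)
    for mapping in mappings:
        result = run_on_engine(job.engine, mapping)
        result["connection"] = hadoop_connection
        warehouse.results[mapping] = result
    return mappings


warehouse = ProfilingWarehouse()
job = ProfileJob(name="customer_profile", columns=["id", "email"], engine="Spark")
submitted = run_profile(job, hadoop_connection="my_hadoop_connection",
                        warehouse=warehouse)
```

In this sketch the warehouse ends up holding one result per mapping, which mirrors the documented order of events: the job is split first, the engine processes each mapping, and the results are written last.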
In the Developer tool, you can run single object profiles, multiple object profiles, and enterprise discovery profiles on the Blaze or Spark engine. In the Analyst tool, you can run column profiles, enterprise discovery profiles, and scorecards on the Blaze or Spark engine.

