Execution Engines

Use a Blaze, a Spark, or a Hive engine to run the Hadoop mappings in a workflow.
The Data Integration Service generates the Blaze, Spark, or Hive engine script based on the mapping logic, a unique identifier for the script, and the tasks that the script depends on.
You can select the execution engine at the plan level. If you select the Hive execution engine, Hive assigns the mapping job to MapReduce. If you select the Blaze execution engine, processing is faster because Blaze uses an internal workflow compiler to run the mapping, so use a Blaze engine when the speed and performance of the task matter.
If you do not use Kerberos authentication, you can use a Blaze engine for complex file targets. For Hive inplace masking, you can use the Hive or the Spark execution engine.
If you use a Hive or a Blaze engine, you can use the following transformations in a mapplet rule:
  • Expression
  • Data Masking
  • Case Converter
  • Comparison
  • Decision
  • Labeler
  • Merge
  • Parser
  • Weighted Average
  • Standardizer
  • Java Passive
If you use a Spark engine, you can use the following transformations in a mapplet rule:
  • Expression
  • Data Masking
  • Java Passive
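The two lists above can be captured as a small lookup table, for example to check a mapplet rule before you choose an engine. The following Python sketch is illustrative only: the names `SUPPORTED_TRANSFORMATIONS` and `unsupported_transformations` are hypothetical and are not part of Test Data Manager or any Informatica API.

```python
# Illustrative only: these names are hypothetical and not part of any product API.
# The sets mirror the transformation lists in this section.

HIVE_BLAZE_TRANSFORMATIONS = frozenset({
    "Expression", "Data Masking", "Case Converter", "Comparison", "Decision",
    "Labeler", "Merge", "Parser", "Weighted Average", "Standardizer",
    "Java Passive",
})

SUPPORTED_TRANSFORMATIONS = {
    "Hive": HIVE_BLAZE_TRANSFORMATIONS,
    "Blaze": HIVE_BLAZE_TRANSFORMATIONS,
    "Spark": frozenset({"Expression", "Data Masking", "Java Passive"}),
}


def unsupported_transformations(engine, transformations):
    """Return the transformations in a mapplet rule that the engine does not support."""
    supported = SUPPORTED_TRANSFORMATIONS[engine]
    return [name for name in transformations if name not in supported]


rule = ["Expression", "Parser", "Data Masking"]
print(unsupported_transformations("Spark", rule))  # ['Parser']
print(unsupported_transformations("Blaze", rule))  # []
```

An empty result means that every transformation in the rule appears in the list for that engine.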
You cannot use a Blaze engine in the following cases:
  • ODBC sources and ODBC dictionaries
  • Complex file targets, if you use Kerberos authentication
  • Truncate target table
  • A Hive source with an HDFS target
  • Hive inplace masking
The Spark engine has the following limitations:
  • You cannot use a Spark engine when the sources are relational databases such as Oracle, Sybase, Microsoft SQL Server, and DB2 for Linux, UNIX, and Windows.
  • You cannot perform shuffle and substitution masking with a Spark engine.
  • You cannot perform data masking operations on the Binary data type in Hive with a Spark engine.
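The Blaze restrictions and Spark limitations listed in this section can be expressed as a simple pre-check. The following Python sketch is an illustration under stated assumptions: `PlanOptions`, its fields, and the two functions are hypothetical names invented for this example and do not correspond to actual Test Data Manager settings or APIs.

```python
from dataclasses import dataclass

# Illustrative only: PlanOptions and its fields are hypothetical names chosen
# for this sketch. They are not Test Data Manager settings or APIs.


@dataclass
class PlanOptions:
    source_type: str = "Hive"          # assumed labels: "Hive", "HDFS", "ODBC", "Relational"
    target_type: str = "Hive"          # assumed labels: "Hive", "HDFS", "ComplexFile"
    uses_odbc_dictionary: bool = False
    uses_kerberos: bool = False
    truncate_target_table: bool = False
    hive_inplace_masking: bool = False
    masking_techniques: tuple = ()     # for example ("shuffle", "substitution")
    masks_hive_binary_columns: bool = False


def blaze_blockers(plan):
    """List the reasons from this section that rule out the Blaze engine."""
    reasons = []
    if plan.source_type == "ODBC" or plan.uses_odbc_dictionary:
        reasons.append("ODBC sources and ODBC dictionaries")
    if plan.target_type == "ComplexFile" and plan.uses_kerberos:
        reasons.append("complex file target with Kerberos authentication")
    if plan.truncate_target_table:
        reasons.append("truncate target table")
    if plan.source_type == "Hive" and plan.target_type == "HDFS":
        reasons.append("Hive source with HDFS target")
    if plan.hive_inplace_masking:
        reasons.append("Hive inplace masking")
    return reasons


def spark_blockers(plan):
    """List the limitations from this section that rule out the Spark engine."""
    reasons = []
    if plan.source_type == "Relational":
        reasons.append("relational database source")
    for technique in sorted({"shuffle", "substitution"} & set(plan.masking_techniques)):
        reasons.append(f"{technique} masking")
    if plan.masks_hive_binary_columns:
        reasons.append("masking of the Binary data type in Hive")
    return reasons


plan = PlanOptions(source_type="Hive", target_type="HDFS",
                   masking_techniques=("shuffle",))
print(blaze_blockers(plan))  # ['Hive source with HDFS target']
print(spark_blockers(plan))  # ['shuffle masking']
```

An empty list from either function means that none of the restrictions listed in this section apply to that engine. It does not guarantee that the plan runs successfully.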