Execution Engines

Use a Blaze, a Spark, or a Hive engine to run the Hadoop mappings in a workflow.
The Data Integration Service generates the Blaze, Spark, or Hive engine script based on the mapping logic, a unique identifier for the script, and the tasks that the script depends on.
You can select the execution engine at the plan level. If you select the Hive execution engine, Hive assigns the mapping job to MapReduce. If you select the Blaze execution engine, the task runs faster because Blaze uses an internal workflow compiler to run the mapping.
You can use a Blaze engine for complex file targets only if you do not use Kerberos authentication. For Hive inplace masking, you can use the Hive or the Spark execution engine.
If you use a Hive or a Blaze engine, you can use the following transformations in a mapplet rule:
  • Expression
  • Data Masking
  • Case Converter
  • Comparison
  • Decision
  • Labeler
  • Merge
  • Parser
  • Weighted Average
  • Standardizer
  • Java Passive
If you use a Spark engine, you can use the following transformations in a mapplet rule:
  • Expression
  • Data Masking
  • Java Passive
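The two transformation lists above can be expressed as a simple lookup. The following is only an illustrative sketch, not part of Test Data Manager: the names `ALLOWED_TRANSFORMATIONS` and `unsupported_transformations` are hypothetical, and the engine keys are lowercase labels chosen for this example.

```python
# Transformations allowed in a mapplet rule, copied from the lists above.
# The dictionary and function names are hypothetical, for illustration only.
HIVE_BLAZE_TRANSFORMATIONS = frozenset({
    "Expression", "Data Masking", "Case Converter", "Comparison",
    "Decision", "Labeler", "Merge", "Parser", "Weighted Average",
    "Standardizer", "Java Passive",
})
SPARK_TRANSFORMATIONS = frozenset({"Expression", "Data Masking", "Java Passive"})

ALLOWED_TRANSFORMATIONS = {
    "hive": HIVE_BLAZE_TRANSFORMATIONS,
    "blaze": HIVE_BLAZE_TRANSFORMATIONS,
    "spark": SPARK_TRANSFORMATIONS,
}


def unsupported_transformations(engine, transformations):
    """Return the transformations, sorted by name, that the engine does not allow."""
    allowed = ALLOWED_TRANSFORMATIONS[engine.lower()]
    return sorted(t for t in transformations if t not in allowed)


if __name__ == "__main__":
    rule = ["Expression", "Parser", "Labeler"]
    for engine in ("hive", "blaze", "spark"):
        print(engine, unsupported_transformations(engine, rule))
```

A check like this could flag a mapplet rule that uses, for example, a Parser transformation before the plan is switched to the Spark engine.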
You cannot use a Blaze engine in the following cases:
  • ODBC sources and ODBC dictionaries
  • Complex file target if you use Kerberos authentication
  • Truncate target table
  • Source is Hive and target is HDFS
  • Hive inplace masking
The Spark engine has the following limitations:
  • You cannot use a Spark engine when the sources are relational databases such as Oracle, Sybase, Microsoft SQL Server, and DB2 for Linux, UNIX, and Windows.
  • You cannot perform shuffle and substitution masking with a Spark engine.
  • With the Spark engine, you cannot perform data masking operations on the Binary data type in Hive.
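The Blaze and Spark restrictions above amount to a set of rules that narrow the engine choice for a plan. The sketch below encodes only the restrictions listed in this section. It assumes a hypothetical `PlanProfile` description of a plan; none of the class, field, or function names come from Test Data Manager, and the section lists no restrictions for the Hive engine, so none are modeled.

```python
from dataclasses import dataclass

# Hypothetical description of a plan; all field names are assumptions.
@dataclass
class PlanProfile:
    source_type: str = "hive"        # for example "hive", "hdfs", "relational"
    target_type: str = "hive"        # for example "hive", "hdfs"
    uses_odbc: bool = False          # ODBC sources or ODBC dictionaries
    uses_kerberos: bool = False
    complex_file_target: bool = False
    truncate_target: bool = False
    hive_inplace_masking: bool = False
    masking_techniques: frozenset = frozenset()
    masks_hive_binary: bool = False  # masks Binary columns in Hive


def engine_blockers(plan):
    """Map each engine to the documented restrictions that the plan violates."""
    blockers = {"blaze": [], "spark": [], "hive": []}

    # Blaze restrictions
    if plan.uses_odbc:
        blockers["blaze"].append("ODBC sources and ODBC dictionaries")
    if plan.complex_file_target and plan.uses_kerberos:
        blockers["blaze"].append("complex file target with Kerberos authentication")
    if plan.truncate_target:
        blockers["blaze"].append("truncate target table")
    if plan.source_type == "hive" and plan.target_type == "hdfs":
        blockers["blaze"].append("source is Hive and target is HDFS")
    if plan.hive_inplace_masking:
        blockers["blaze"].append("Hive inplace masking")

    # Spark restrictions
    if plan.source_type == "relational":
        blockers["spark"].append("relational database source")
    techniques = {t.lower() for t in plan.masking_techniques}
    if techniques & {"shuffle", "substitution"}:
        blockers["spark"].append("shuffle or substitution masking")
    if plan.masks_hive_binary:
        blockers["spark"].append("masking the Binary data type in Hive")

    return blockers


def eligible_engines(plan):
    """Return the engines with no documented restriction for this plan."""
    return [name for name, reasons in engine_blockers(plan).items() if not reasons]


if __name__ == "__main__":
    plan = PlanProfile(hive_inplace_masking=True)
    print(eligible_engines(plan))
    print(engine_blockers(plan)["blaze"])
```

Returning the reasons alongside the engine names keeps the check explainable: a caller can report why an engine was ruled out rather than only that it was.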