Use the Blaze or Spark engine to run Hadoop mappings in a workflow.
The Data Integration Service generates the Blaze or Spark engine script based on the mapping logic, a unique identifier for the script, and the tasks that the script depends on.
You can select the execution engine at the plan level. Select the Blaze engine to improve the speed and performance of the task. Blaze processing is faster because the engine uses an internal workflow compiler to run the mapping.
If you do not use Kerberos authentication, you can use the Blaze engine for complex file targets. For Hive inplace masking, use the Spark execution engine.
If you use the Blaze engine, you can use the following transformations in a mapplet rule:
Expression
Data Masking
Case Converter
Comparison
Decision
Labeler
Merge
Parser
Weighted Average
Standardizer
Java Passive
If you use the Spark engine, you can use the following transformations in a mapplet rule:
Expression
Data Masking
Java Passive
You cannot use the Blaze engine for the following options:
ODBC sources and ODBC dictionaries
Complex file target if you use Kerberos authentication
Truncate target table
Source is Hive and target is HDFS
Hive inplace masking
The Spark engine has the following limitations:
To use the Spark engine when the sources are relational databases such as Oracle, Sybase, Microsoft SQL Server, or DB2 for Linux, UNIX, and Windows, you must create the connection with the JDBC connection type. You cannot use any other connection type.
You cannot perform shuffle or substitution masking with the Spark engine.
With the Spark engine, you cannot perform data masking operations on the Binary data type in Hive.
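For reference, a JDBC connection to these relational sources uses a standard driver-specific connection URL. The following sketch shows typical URL formats for the supported databases; the host names, ports, and database names are placeholders for illustration only, not values from this document or from a specific Informatica connection object:

```
Oracle (thin driver):  jdbc:oracle:thin:@//dbhost:1521/servicename
Sybase (jConnect):     jdbc:sybase:Tds:dbhost:5000/dbname
Microsoft SQL Server:  jdbc:sqlserver://dbhost:1433;databaseName=dbname
DB2 for LUW:           jdbc:db2://dbhost:50000/dbname
```

Consult your database driver documentation for the exact URL syntax and port defaults in your environment.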