Hadoop Plan Settings

Enter source and target connections for the Hadoop plan.
The following table describes the connection options:
Connection Options
Description
Source Connection
Required. A connection to the source database. Select a source connection from the list. When you create a Hadoop plan, you can select Oracle, DB2, Sybase, Microsoft SQL Server, Hive, flat file, or HDFS connections.
Target Connection
Required. When you create a Hadoop plan, you can select a relational or an HDFS target connection from the list. If you select the relational target connection type, you can select a Hive connection.
Resource Format
Required if you select an HDFS target connection. The format of the target file. You can select one of the following file formats:
  • None. The target contains the HDFS file format.
  • AVRO. A data serialization system. A complex file data object for Avro data sources in the local system. The target contains the Avro file format.
  • Parquet. A complex file data object for Parquet data sources in the local system. The target contains the Parquet file format.
Truncate Tables
Truncates the table before loading it. By default, this option is selected. You can truncate tables for Hive connections. You cannot truncate tables if you use an HDFS connection or the Blaze execution engine.
Stop on Error
The number of nonfatal errors that the Data Integration Service can encounter before it stops the mapping. If you enter zero, the mapping does not stop for nonfatal errors. Default is zero.
Recover Strategy
Strategy for recovering a workflow when errors occur.
Choose one of the following recovery strategies:
  • Start from last failure. The Data Integration Service continues to run the workflow from the previous failed state.
  • Start from beginning. The Data Integration Service runs the workflow from the beginning when it recovers the workflow.
Date-time Format String
Date-time format defined in the session properties. You can specify a precision of seconds, milliseconds, microseconds, or nanoseconds:
  • Seconds. MM/DD/YYYY HH24:MI:SS
  • Milliseconds. MM/DD/YYYY HH24:MI:SS.MS
  • Microseconds. MM/DD/YYYY HH24:MI:SS.US
  • Nanoseconds. MM/DD/YYYY HH24:MI:SS.NS
Default is microseconds.
Max Parallel Sessions
The maximum number of mappings that can run at the same time.
Locale
Sets the locale for data movement and data masking operations.
Persist Mapping
Optional. Stores the mappings in the Model repository for future use.
Execution Engine
The Hadoop environment that runs the mapping. Select Blaze or Spark.
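
Several of the options above depend on one another: Resource Format is required only for an HDFS target, and Truncate Tables is not available with an HDFS connection or the Blaze execution engine. The following Python sketch models those rules so that a set of values can be checked before you enter it in the plan. It is an illustration only. The class, field, and function names are hypothetical and are not part of any Test Data Manager API, and the sketch assumes that a selected Truncate Tables option with an HDFS target or the Blaze engine should be reported as a conflict.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical model of the settings table; none of these names come from a TDM API.
RESOURCE_FORMATS = ("None", "AVRO", "Parquet")
EXECUTION_ENGINES = ("Blaze", "Spark")
TARGET_TYPES = ("Relational", "HDFS")  # a relational target means a Hive connection


@dataclass
class HadoopPlanSettings:
    source_connection: str
    target_connection: str
    target_type: str
    execution_engine: str
    # Python None means "not chosen"; the string "None" is one of the listed formats.
    resource_format: Optional[str] = None
    truncate_tables: bool = True  # the table says this option is selected by default
    stop_on_error: int = 0        # zero: the mapping does not stop for nonfatal errors


def validate(s: HadoopPlanSettings) -> List[str]:
    """Return the rule violations found; an empty list means no conflict."""
    problems = []
    if not s.source_connection:
        problems.append("Source Connection is required.")
    if not s.target_connection:
        problems.append("Target Connection is required.")
    if s.target_type not in TARGET_TYPES:
        problems.append(f"Unknown target type: {s.target_type}")
    if s.execution_engine not in EXECUTION_ENGINES:
        problems.append(f"Unknown execution engine: {s.execution_engine}")
    if s.target_type == "HDFS":
        if s.resource_format is None:
            problems.append("Resource Format is required for an HDFS target.")
        elif s.resource_format not in RESOURCE_FORMATS:
            problems.append(f"Unknown resource format: {s.resource_format}")
    # Assumption: a selected Truncate Tables option is treated as a conflict here.
    if s.truncate_tables and (s.target_type == "HDFS" or s.execution_engine == "Blaze"):
        problems.append("Truncate Tables is not available with HDFS or Blaze.")
    if s.stop_on_error < 0:
        problems.append("Stop on Error cannot be negative.")
    return problems


if __name__ == "__main__":
    plan = HadoopPlanSettings(
        source_connection="src_oracle",   # placeholder connection names
        target_connection="tgt_hdfs",
        target_type="HDFS",
        execution_engine="Blaze",
    )
    for problem in validate(plan):
        print(problem)
```

The sketch returns a list of messages instead of raising on the first conflict, so that all conflicting options can be corrected in one pass.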