Table of Contents

  1. About the Enterprise Data Preparation Administrator Guide
  2. Introduction to Enterprise Data Preparation Administration
  3. Getting Started
  4. Administration Process
  5. User Account Setup
  6. Search Configuration
  7. Roles, Privileges, and Profiles
  8. Data Asset Access and Publication Management
  9. Masking Sensitive Data
  10. Monitoring Enterprise Data Preparation
  11. Backing Up and Restoring Enterprise Data Preparation
  12. Managing the Data Lakehouse
  13. Schedule Export, Import and Publish Activities
  14. Interactive Data Preparation Service
  15. Enterprise Data Preparation Service

Enterprise Data Preparation Administrator Guide

Data Preparation Process

Enterprise Data Preparation connects to several Hadoop services on a Hadoop cluster to read from and write to Hive tables and locations in the data lake, to write events, and to store sample preparation data. Enterprise Data Preparation connects to the following services in the Hadoop cluster:
- When an analyst uploads data to the data lake, the Enterprise Data Preparation Service connects to the Hadoop Distributed File System (HDFS) to stage the data in HDFS files (see the HDFS sketch after this list).
- When an analyst prepares data, the Interactive Data Preparation Service connects to HDFS to store, in HDFS files, the sample data being prepared in worksheets.
- When an analyst previews data, the Enterprise Data Preparation Service connects to the Data Integration Service and reads the first 100 rows from the mapping using the JDBC driver (see the JDBC preview sketch after this list).
- When an analyst prepares data, the Interactive Data Preparation Service connects to HDFS, reads sample data from the Hive table or file, and displays the data in a worksheet.
- When an analyst uploads data, the Enterprise Data Preparation Service connects to the Data Integration Service to read the temporary data staged in HDFS. If the analyst uploads the data to Hive, the application writes the data to a Hive table (see the Hive load sketch after this list).
- When an analyst publishes prepared data, the Enterprise Data Preparation Service connects to the Data Integration Service to run the converted mappings in the Hadoop environment. The Data Integration Service applies the mapping to the data in the input source and writes the transformed data to a Hive table in the data lake (see the publish sketch after this list).
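
HDFS sketch. The staging and sample-data steps above rely on ordinary HDFS file operations. The following Java sketch shows that kind of interaction using the Apache Hadoop FileSystem API. The NameNode address and the /datalake paths are hypothetical placeholders; Enterprise Data Preparation performs these operations internally rather than through user code.

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.nio.charset.StandardCharsets;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsStagingSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Hypothetical NameNode address.
            conf.set("fs.defaultFS", "hdfs://namenode.example.com:8020");

            try (FileSystem fs = FileSystem.get(conf)) {
                // Stage an uploaded file in the data lake (hypothetical paths).
                fs.copyFromLocalFile(new Path("/tmp/upload.csv"),
                                     new Path("/datalake/staging/upload.csv"));

                // Read the staged file back, as a service would when building a sample.
                try (FSDataInputStream in = fs.open(new Path("/datalake/staging/upload.csv"));
                     BufferedReader reader =
                         new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
                    String line;
                    while ((line = reader.readLine()) != null) {
                        System.out.println(line);
                    }
                }
            }
        }
    }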
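
JDBC preview sketch. The preview step reads only the first 100 rows over JDBC. Enterprise Data Preparation does this through the Data Integration Service JDBC driver; as a stand-in, the sketch below shows the same pattern with the Apache Hive JDBC driver, a hypothetical connection URL, and a hypothetical table.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class PreviewSketch {
        public static void main(String[] args) throws Exception {
            // Hypothetical HiveServer2 URL; the product itself goes through the
            // Data Integration Service JDBC driver instead.
            String url = "jdbc:hive2://hiveserver.example.com:10000/default";
            try (Connection conn = DriverManager.getConnection(url, "edp_user", "");
                 Statement stmt = conn.createStatement();
                 // Limit the preview to the first 100 rows, as described above.
                 ResultSet rs = stmt.executeQuery("SELECT * FROM sales_orders LIMIT 100")) {
                int columns = rs.getMetaData().getColumnCount();
                while (rs.next()) {
                    StringBuilder row = new StringBuilder();
                    for (int i = 1; i <= columns; i++) {
                        row.append(rs.getString(i));
                        if (i < columns) {
                            row.append(", ");
                        }
                    }
                    System.out.println(row);
                }
            }
        }
    }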
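
Hive load sketch. When uploaded data lands in Hive, the staged HDFS file ends up in a Hive table. This is a minimal sketch of that outcome using the Hive JDBC driver with hypothetical table and path names; in the product, the Data Integration Service performs the write.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class HiveLoadSketch {
        public static void main(String[] args) throws Exception {
            // Hypothetical HiveServer2 URL.
            String url = "jdbc:hive2://hiveserver.example.com:10000/default";
            try (Connection conn = DriverManager.getConnection(url, "edp_user", "");
                 Statement stmt = conn.createStatement()) {
                // Create a target table matching the uploaded file's layout (hypothetical schema).
                stmt.execute("CREATE TABLE IF NOT EXISTS uploads.customer_upload ("
                        + "customer_id STRING, name STRING, country STRING) "
                        + "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','");
                // Move the staged HDFS file into the table.
                stmt.execute("LOAD DATA INPATH '/datalake/staging/upload.csv' "
                        + "INTO TABLE uploads.customer_upload");
            }
        }
    }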
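
Publish sketch. Publishing runs the converted mappings on the Data Integration Service, so no direct SQL is involved; the net effect, though, is a transform-and-write into a Hive table in the data lake. The HiveQL below sketches that effect with hypothetical source and target tables and a simple, assumed transformation; it is not the mapping the product generates.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class PublishSketch {
        public static void main(String[] args) throws Exception {
            // Hypothetical HiveServer2 URL.
            String url = "jdbc:hive2://hiveserver.example.com:10000/default";
            try (Connection conn = DriverManager.getConnection(url, "edp_user", "");
                 Statement stmt = conn.createStatement()) {
                // Target table in the data lake (hypothetical name and schema).
                stmt.execute("CREATE TABLE IF NOT EXISTS datalake.published_orders ("
                        + "order_id STRING, order_total DOUBLE, order_year INT) "
                        + "STORED AS PARQUET");
                // Apply a simple transformation to the input source and write the result,
                // roughly what a converted mapping does when it runs in Hadoop.
                stmt.execute("INSERT INTO TABLE datalake.published_orders "
                        + "SELECT order_id, CAST(amount AS DOUBLE), year(order_date) "
                        + "FROM staging.sales_orders WHERE amount IS NOT NULL");
            }
        }
    }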