Table of Contents

  1. Preface
  2. Analyst Service
  3. Catalog Service
  4. Content Management Service
  5. Data Integration Service
  6. Data Integration Service Architecture
  7. Data Integration Service Management
  8. Data Integration Service Grid
  9. Data Integration Service REST API
  10. Data Integration Service Applications
  11. Data Privacy Management Service
  12. Enterprise Data Preparation Service
  13. Interactive Data Preparation Service
  14. Informatica Cluster Service
  15. Mass Ingestion Service
  16. Metadata Access Service
  17. Metadata Manager Service
  18. Model Repository Service
  19. PowerCenter Integration Service
  20. PowerCenter Integration Service Architecture
  21. High Availability for the PowerCenter Integration Service
  22. PowerCenter Repository Service
  23. PowerCenter Repository Management
  24. PowerExchange Listener Service
  25. PowerExchange Logger Service
  26. SAP BW Service
  27. Search Service
  28. System Services
  29. Test Data Manager Service
  30. Test Data Warehouse Service
  31. Web Services Hub
  32. Application Service Upgrade
  33. Appendix A: Application Service Databases
  34. Appendix B: Connecting to Databases from Windows
  35. Appendix C: Connecting to Databases from UNIX or Linux
  36. Appendix D: Updating the DynamicSections Parameter of a DB2 Database

Data Preview Service Module

The Data Preview Service Module manages requests from the Developer tool to preview source or transformation data in a mapping.
When you preview data, the Developer tool sends the request to the Data Integration Service. The Data Integration Service uses the Data Preview Service Module to determine whether to run the job in the native or non-native environment based on the preview point. The preview point is the object in a mapping that you choose to view data for.
Data preview jobs run on either the Data Integration Service or the Spark engine. The Spark engine runs the job in the following cases:
  • The preview point or any upstream transformation contains hierarchical data.
  • The preview point or any upstream transformation is a Python transformation.
  • The preview point or any upstream transformation is an Expression transformation configured for windowing.
  • The mapping contains a combination of transformations that must run on the Spark engine.
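The selection rules above can be sketched as a small decision function. This is a hypothetical illustration only; the class and function names are stand-ins, not Informatica APIs, and the "combination of transformations" case is modeled as a precomputed flag.

```python
from dataclasses import dataclass

@dataclass
class Transformation:
    """Illustrative stand-in for a mapping transformation."""
    type: str
    has_hierarchical_data: bool = False
    windowing: bool = False  # Expression transformation configured for windowing

def requires_spark(preview_point, upstream, mapping_forces_spark=False):
    """Return True if the preview job must run on the Spark engine,
    per the rules listed above; otherwise the Data Integration Service
    runs it natively."""
    for t in (preview_point, *upstream):
        if t.has_hierarchical_data:
            return True
        if t.type == "Python":
            return True
        if t.type == "Expression" and t.windowing:
            return True
    # Some combinations of transformations also force the Spark engine;
    # modeled here as a flag computed elsewhere for the whole mapping.
    return mapping_forces_spark
```

For example, previewing a Filter transformation downstream of a Python transformation forces the Spark engine, while a mapping of only relational transformations runs natively.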
When the Spark engine runs a data preview job, the job uses either the Spark Jobserver or spark-submit scripts depending on the cluster distribution you configure. If you configure the mapping with a distribution that supports Spark Jobserver, the Data Preview Service Module uses Spark Jobserver to run preview jobs on the Spark engine. Otherwise, the Data Preview Service Module uses a spark-submit script.
For more information about supported cluster distributions, see the Data Engineering Integration User Guide.
When the Data Integration Service receives a preview request that uses the Spark Jobserver, the Data Preview Service Module starts the Spark Jobserver and passes the mapping to the Logical Data Transformation Manager (LDTM). The LDTM generates a Spark workflow, and the Spark Jobserver runs the job on the Hadoop cluster. The data preview job stages the result in the configured HDFS staging directory, and the Data Integration Service passes the staged data to the Developer tool.
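The Jobserver flow can be summarized as the following sketch. Every name here is an illustrative stub, not a real Informatica or Spark Jobserver API; the HDFS staging directory is modeled as an in-memory dictionary.

```python
class SparkJobserver:
    """Stand-in for the Spark Jobserver started by the service module."""
    def run(self, workflow, staging_dir):
        # The real job executes on the Hadoop cluster and stages its
        # result rows in the configured HDFS staging directory.
        staging_dir["rows"] = [f"row for {workflow}"]

def ldtm_generate_workflow(mapping, preview_point):
    # The LDTM translates the mapping into a Spark workflow.
    return f"spark-workflow({mapping}:{preview_point})"

def run_preview(mapping, preview_point):
    staging_dir = {}                 # stands in for the HDFS staging directory
    jobserver = SparkJobserver()     # started on demand for the preview request
    workflow = ldtm_generate_workflow(mapping, preview_point)
    jobserver.run(workflow, staging_dir)
    return staging_dir["rows"]       # staged data returned to the Developer tool
```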
