Table of Contents


  1. Preface
  2. Introduction to Informatica Big Data Management
  3. Mappings
  4. Sources
  5. Targets
  6. Transformations
  7. Data Preview
  8. Cluster Workflows
  9. Profiles
  10. Monitoring
  11. Hierarchical Data Processing
  12. Hierarchical Data Processing Configuration
  13. Hierarchical Data Processing with Schema Changes
  14. Intelligent Structure Models
  15. Stateful Computing
  16. Appendix A: Connections
  17. Appendix B: Data Type Reference
  18. Appendix C: Function Reference

Cluster Workflows Overview

You can run a workflow that creates a cluster on a cloud platform and runs Mapping tasks and other tasks on that cluster.
Create cluster workflows to run on the Amazon EMR or Microsoft Azure HDInsight cloud platforms in the Hadoop environment. Create cluster workflows to run on the Databricks cloud platform in the Databricks environment.
The cluster workflow uses other elements that enable communication between the Data Integration Service and the cloud platform, such as a cloud provisioning configuration and a cluster connection.
A cluster workflow contains a Create Cluster task that you configure with information about the cluster to create. If you want to create an ephemeral cluster, you can include a Delete Cluster task. An ephemeral cluster is a cloud platform cluster that you create to run mappings and other tasks, and then terminate when tasks are complete. Create ephemeral clusters to save cloud platform resources.
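The ephemeral cluster pattern described above can be sketched in plain Python. This is a hypothetical illustration of the lifecycle only: the function names (`create_cluster`, `run_task`, `delete_cluster`) are stand-ins and not Informatica APIs, and the `try`/`finally` mirrors the idea that the cluster is terminated once the tasks complete.

```python
# Hypothetical sketch of the lifecycle a cluster workflow automates:
# create a cluster, run tasks on it, then delete it.
# Function names are illustrative stand-ins, not Informatica APIs.

def create_cluster(config):
    # Stand-in for the Create Cluster task: provision a cluster on the
    # cloud platform and return a handle to it.
    return {"id": "cluster-001", "config": config, "running": True}

def run_task(cluster, task):
    # Stand-in for a Mapping task or other task that runs on the cluster.
    return f"{task} completed on {cluster['id']}"

def delete_cluster(cluster):
    # Stand-in for the Delete Cluster task: terminate the cluster so that
    # cloud platform resources are released.
    cluster["running"] = False

def run_ephemeral_workflow(config, tasks):
    cluster = create_cluster(config)
    results = []
    try:
        for task in tasks:
            results.append(run_task(cluster, task))
    finally:
        # Terminate the cluster even if a task fails, so resources
        # are not left allocated on the cloud platform.
        delete_cluster(cluster)
    return results, cluster["running"]

results, still_running = run_ephemeral_workflow({"platform": "EMR"},
                                                ["mapping_1", "mapping_2"])
```

The `finally` block is the key design point: in a real cluster workflow, placing the Delete Cluster task so it always runs is what keeps an ephemeral cluster from outliving its tasks.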
To create a cluster on Cloudera Altus, you create a workflow with Command tasks that perform the tasks that a cluster workflow automates. For more information about creating a cluster on Cloudera Altus, see the article "Implementing Informatica Big Data Management with Ephemeral Clusters on a Cloudera Altus Cluster" on the Informatica Network.
