Cluster Workflows Platform Support

Cluster workflow capabilities vary depending on the environment and the cloud platform.

Hadoop Clusters

On the Azure platform, you can create an ephemeral HDInsight cluster that accesses ADLS Gen2 resources.
On the AWS platform, you can create an ephemeral Amazon EMR cluster to access S3, Redshift, and Snowflake resources.
On Cloudera Altus, you create a workflow with Command tasks that perform the steps that a cluster workflow would otherwise automate. For more information about creating a cluster on Cloudera Altus, see the article "How to Create Cloudera Altus Clusters with a Cluster Workflow" on the Informatica Documentation Portal.

Databricks Clusters

On the Azure and AWS Databricks platforms, you can configure Databricks ephemeral clusters.
Optionally, you can configure the clusters to start from sets of cached VM instances, called warm pools. These instances wait on standby in a running state for ephemeral cluster creation. You can choose to have the instances remain on standby when the ephemeral clusters are terminated. For more information, see the Databricks documentation.
Enable warm pools for ephemeral Databricks clusters in the advanced properties of the Create Cluster workflow task.

