Sorter Transformation on the Databricks Spark Engine

Some processing rules for the Databricks Spark engine differ from the processing rules for the Data Integration Service.

Mapping Validation

Mapping validation fails when case sensitivity is disabled.
The Data Integration Service logs a warning and ignores the Sorter transformation in the following situations:
  • There is a type mismatch between the target and the Sorter transformation sort keys.
  • The transformation contains sort keys that are not connected to the target.
  • The Write transformation is not configured to maintain row order.
  • The transformation is not directly upstream from the Write transformation.
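
The four situations above can be read as a checklist. The following Python sketch is an illustration only, not Informatica code or an Informatica API: the SorterContext class, its field names, and the reasons_sorter_ignored function are hypothetical, and the data type names are placeholders. It collects every listed reason that applies to a mapping description.

```python
from dataclasses import dataclass


@dataclass
class SorterContext:
    """Hypothetical description of a Sorter transformation and its target."""
    sort_key_types: dict            # sort key name -> data type in the Sorter
    target_column_types: dict       # target column name -> data type
    key_to_target: dict             # sort key name -> linked target column, or None
    write_maintains_row_order: bool
    directly_upstream_of_write: bool


def reasons_sorter_ignored(ctx):
    """Return the listed situations that apply; an empty list means none do."""
    reasons = []
    for key, key_type in ctx.sort_key_types.items():
        column = ctx.key_to_target.get(key)
        if column is None:
            # Sort key is not connected to the target.
            reasons.append(f"sort key '{key}' is not connected to the target")
        elif ctx.target_column_types.get(column) != key_type:
            # Sort key and target column have different data types.
            reasons.append(f"type mismatch between sort key '{key}' and the target")
    if not ctx.write_maintains_row_order:
        reasons.append("Write transformation is not configured to maintain row order")
    if not ctx.directly_upstream_of_write:
        reasons.append("Sorter is not directly upstream from the Write transformation")
    return reasons
```

An empty result means that none of the listed situations applies to the description passed in. It does not predict how the engine behaves for cases that this page does not cover.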

Null Values

The Data Integration Service treats null values as low even if you configure the transformation to treat null values as high.
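
To make "null values as low" concrete, the following plain Python sketch mimics that ordering for an ascending sort. It is an illustration only, not Informatica or Spark code, and the function name sort_nulls_low is hypothetical. Null values (None) are placed before every non-null value.

```python
def sort_nulls_low(values):
    """Sort ascending, ordering None (null) before every non-null value."""
    # The key is a tuple. False sorts before True, so None entries come first.
    # The second element is only ever compared between two non-null values
    # or between two placeholders, so mixed None/value comparison never occurs.
    return sorted(values, key=lambda v: (v is not None, v if v is not None else 0))


print(sort_nulls_low([3, None, 1, None, 2]))
```

With nulls treated as low, an ascending sort lists the null rows first, and a descending sort would list them last.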

Data Cache Optimization

You cannot optimize the sorter cache to store data using variable length.

Parallel Sorting

The Data Integration Service enables parallel sorting only when the following conditions are met:
  • The mapping does not include another transformation between the Sorter transformation and the target.
  • The data type of the sort keys does not change between the Sorter transformation and the target.
  • Each sort key in the Sorter transformation is linked to a column in the target.


Updated November 10, 2020