Sorter Transformation on the Databricks Spark Engine


Some processing rules for the Databricks Spark engine differ from the processing rules for the Data Integration Service.

Mapping Validation

Mapping validation fails when case sensitivity is disabled.
The Data Integration Service logs a warning and ignores the Sorter transformation in the following situations:
  • There is a type mismatch between the target and the Sorter transformation sort keys.
  • The transformation contains sort keys that are not connected to the target.
  • The Write transformation is not configured to maintain row order.
  • The transformation is not directly upstream from the Write transformation.
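The four situations above can be read as a pre-run checklist. The following is a minimal sketch of that checklist in Python, assuming that any single situation is enough for the Sorter transformation to be ignored. The `SorterSetup` class, its field names, and the `ignore_reasons` function are hypothetical illustrations and are not part of any Informatica API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SorterSetup:
    """Hypothetical summary of how a Sorter transformation is wired in a mapping."""
    sort_key_types_match_target: bool
    all_sort_keys_connected_to_target: bool
    write_maintains_row_order: bool
    directly_upstream_of_write: bool


def ignore_reasons(setup: SorterSetup) -> List[str]:
    """Return the documented situations that apply; an empty list means none apply."""
    reasons = []
    if not setup.sort_key_types_match_target:
        reasons.append("type mismatch between the target and the sort keys")
    if not setup.all_sort_keys_connected_to_target:
        reasons.append("sort keys not connected to the target")
    if not setup.write_maintains_row_order:
        reasons.append("Write transformation does not maintain row order")
    if not setup.directly_upstream_of_write:
        reasons.append("Sorter is not directly upstream from the Write transformation")
    return reasons
```

An empty result means that none of the listed situations applies. Any non-empty result names a situation in which the Sorter transformation would be ignored with a warning.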

Null Values

The Data Integration Service treats null values as low even if you configure the transformation to treat null values as high.
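To illustrate what treating null values as low means for sort order, the following sketch sorts a list in plain Python with `None` standing in for null. The function name `sort_nulls_low` is hypothetical. The code only models the resulting order and is not how the Databricks Spark engine implements sorting.

```python
from typing import List, Optional


def sort_nulls_low(values: List[Optional[int]], descending: bool = False) -> List[Optional[int]]:
    """Sort values while treating None as lower than every non-null value."""
    nulls = [v for v in values if v is None]
    non_nulls = sorted((v for v in values if v is not None), reverse=descending)
    # Lowest values come first in ascending order and last in descending order.
    return non_nulls + nulls if descending else nulls + non_nulls


print(sort_nulls_low([3, None, 1, 2]))                   # [None, 1, 2, 3]
print(sort_nulls_low([3, None, 1, 2], descending=True))  # [3, 2, 1, None]
```

With nulls treated as low, null rows appear first in an ascending sort and last in a descending sort.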

Data Cache Optimization

You cannot optimize the Sorter cache to store data in variable-length format.

Parallel Sorting

The Data Integration Service enables parallel sorting with the following restrictions:
  • The mapping does not include another transformation between the Sorter transformation and the target.
  • The data type of the sort keys does not change between the Sorter transformation and the target.
  • Each sort key in the Sorter transformation is linked to a column in the target.
