Databricks Schema Evolution

Schema enforcement and evolution enable you to manage changes in a Databricks table schema. You can choose different strategies to manage schema changes.
Schema enforcement, also known as schema validation, monitors writes to Databricks tables and rejects any write whose schema does not match the target table schema. When Databricks rejects a write, it cancels the write transaction and logs an exception. If you decide to incorporate new columns in the target, schema evolution enables you to add them to the target in a controlled fashion.
To use schema evolution, you need to disable schema enforcement in the target Databricks workspace.
For more information about Databricks schema enforcement and evolution, see the Databricks documentation.
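The difference between enforcement and evolution can be illustrated with a toy model. The following sketch is plain Python and does not call any Databricks or Informatica API. The names `SchemaMismatchError` and `apply_write` are hypothetical, and the rule that a changed column type is always rejected is a simplifying assumption.

```python
class SchemaMismatchError(Exception):
    """Toy stand-in for the exception logged when a write is rejected."""


def apply_write(target_schema, incoming_schema, allow_evolution=False):
    """Return the target schema that results from a simulated write.

    Both schemas are dicts that map a column name to a type name.
    The input dicts are never modified.
    """
    # Columns that exist on both sides but with different types.
    changed = [
        name
        for name, dtype in incoming_schema.items()
        if name in target_schema and target_schema[name] != dtype
    ]
    if changed:
        # Assumption: this toy model never widens or converts types.
        raise SchemaMismatchError(f"type changed for columns: {changed}")

    # Columns that the incoming data has but the target lacks.
    added = {
        name: dtype
        for name, dtype in incoming_schema.items()
        if name not in target_schema
    }
    if added and not allow_evolution:
        # Enforcement: reject the whole write instead of altering the table.
        raise SchemaMismatchError(f"unexpected columns: {sorted(added)}")

    # Evolution: append the new columns to the existing schema.
    return {**target_schema, **added}
```

With `allow_evolution=False`, a write that carries an extra column raises `SchemaMismatchError` and the target schema is unchanged. With `allow_evolution=True`, the same write returns a schema that contains the extra column.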
Use the Developer tool to apply schema evolution to a mapping and update Databricks target tables.
You can use the following strategies to enable schema evolution:
  • CREATE. When you configure mappings to use the CREATE strategy, the mapping drops the existing target table and replaces it with a target table that uses the new schema.
    The mapping drops the existing target table with all its data and recreates the target table with all the columns from the source and their data. The mapping repeats this drop and recreate every time it runs, so the target always reflects the current source schema.
  • RETAIN using pre-SQL. When you configure mappings to use the RETAIN strategy, you use a pre-SQL statement to add a subset of the schema's changed columns to the existing target table. Any existing target data is also retained. You can parameterize the pre-SQL query, or substitute a parameter set.
  • CREATE using post-SQL. When you configure mappings to use the CREATE strategy with a post-SQL statement, you add a subset of the schema's changed columns to the existing target table. Any existing target data is also retained. The mapping loads source data from the selected new columns into a temporary table, and then the post-SQL statement merges it into the existing target table schema. You can parameterize the post-SQL query, or substitute a parameter set.
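As an illustration of the RETAIN strategy, the following Python sketch compares a source schema with a target schema and builds an `ALTER TABLE ... ADD COLUMNS` statement that could serve as the pre-SQL. It is a minimal sketch under stated assumptions, not the statement that the Developer tool generates: the helper names `new_columns` and `build_retain_pre_sql`, the table name, and the column names are hypothetical, and the identifiers are assumed not to need quoting.

```python
def new_columns(source_schema, target_schema):
    """List (name, type) pairs that the source has and the target lacks.

    Both schemas are dicts that map a column name to a type name.
    The result keeps the source column order.
    """
    return [
        (name, dtype)
        for name, dtype in source_schema.items()
        if name not in target_schema
    ]


def build_retain_pre_sql(table, columns):
    """Build an ALTER TABLE ... ADD COLUMNS statement for the given columns."""
    if not columns:
        raise ValueError("no columns to add")
    column_list = ", ".join(f"{name} {dtype}" for name, dtype in columns)
    return f"ALTER TABLE {table} ADD COLUMNS ({column_list})"


# Hypothetical schemas: the source gained two columns.
source = {"id": "INT", "name": "STRING", "email": "STRING", "age": "INT"}
target = {"id": "INT", "name": "STRING"}

candidates = new_columns(source, target)
# Add only a subset of the changed columns, here the first one.
pre_sql = build_retain_pre_sql("sales.customers", candidates[:1])
```

Because the statement only adds columns, the existing rows in the target stay in place, which matches the retain behavior described above.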
You can't use schema evolution under the following conditions:
  • When table access control is enabled.
  • With INSERT INTO or .write.insertInto() clauses in pre-SQL or post-SQL.
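A pre-SQL or post-SQL statement can be screened for these clauses before the mapping runs. The following Python sketch is a hypothetical helper, not part of the Developer tool. It uses simple pattern matching, so it also flags occurrences inside comments or string literals.

```python
import re

# Each entry pairs a label with a pattern for one disallowed clause.
_DISALLOWED = [
    ("INSERT INTO", re.compile(r"\bINSERT\s+INTO\b", re.IGNORECASE)),
    (".write.insertInto()", re.compile(r"\.write\s*\.\s*insertInto\s*\(")),
]


def find_disallowed_clauses(statement):
    """Return the labels of the disallowed clauses found in the statement."""
    return [label for label, pattern in _DISALLOWED if pattern.search(statement)]
```

An empty result means that neither clause was found. For example, an `ALTER TABLE ... ADD COLUMNS` statement returns an empty list, while a statement that contains `INSERT INTO` returns `["INSERT INTO"]`.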
