Table of Contents


  1. Preface
  2. Using the Designer
  3. Working with Sources
  4. Working with Flat Files
  5. Working with Targets
  6. Mappings
  7. Mapplets
  8. Mapping Parameters and Variables
  9. Working with User-Defined Functions
  10. Using the Debugger
  11. Viewing Data Lineage
  12. Comparing Objects
  13. Managing Business Components
  14. Creating Cubes and Dimensions
  15. Using the Mapping Wizards
  17. Datatype Reference
  18. Configure the Web Browser

Maintaining Star Schemas

Maintaining Star Schemas

When you design a star schema, you create a fact table for information collected over time, such as purchases or transactions. You then build separate dimension tables for related lists of information, such as inventory or shipping methods. Each dimension table has a logical primary or generated composite key to enable access to dimensional data. For example, you might use an inventory part number as a primary key, or create a composite key using a part number and a current flag.
The following image shows a star schema with four dimension tables and one fact table:
Each dimension table contains a primary key value. Each primary key is joined to a foreign key value in the fact table.
When you implement a star schema, you decide how to handle updates to the fact and dimension tables. Fact tables change regularly as new information collects. Do you need to keep all existing data in the fact table, or do you want only the most recent version or snapshot?
If you do not need any historical fact information, you can drop or truncate the existing fact table before using a new session in a workflow. To keep historical information in a fact table, you usually append the latest snapshot to the existing table, using a flag such as a load date or session number to identify the most recent snapshot.
Although dimension tables are typically static lists, most dimension tables do change over time. For example, you might need to update an inventory dimension once a month to reflect new or changed part numbers. Since these changes are smaller in magnitude compared to changes in fact tables, these dimensions are known as slowly growing or slowly changing dimensions.
Slowly growing dimensions are dimension tables that have slowing increasing dimension data, without updates to existing dimensions. You maintain slowly growing dimensions by appending new data to the existing table.
Slowly changing dimensions are dimension tables that have slowly increasing dimension data and updates to existing dimensions. When updating existing dimensions, you decide whether to keep all historical dimension data, no historical data, or just the current and previous versions of dimension data.
When you do not need historical information in a slowly growing or slowly changing dimension table, you can drop or truncate the existing table before using a new session in a workflow. However, in some cases, inserting new dimensions and updating existing dimensions can be more efficient than reloading the entire table.
When you need historical information in a dimension table, you decide how to differentiate between current and historical data in the target:
  • To keep a full history, you might version new data by:
    • Creating a version number and versioning the primary key.
    • Creating a composite key using a current version flag.
    • Creating an effective date range.
  • To keep a partial history, you might keep a current version and a previous version in a single row. You can also enter a timestamp to indicate the most recent update.

Updated June 25, 2018