Data Vault Administrator Guide

Data File Calculation

When you repartition data, you specify a row count. The Data Vault uses the row count to calculate the total number of repartitioned data files for a table. Then, the Data Vault determines the minimum and maximum values for each repartitioned data file.
You can configure the following row count options when you repartition data files:
Keep the same row count
Configure the same row count as the original data files to create the same number of repartitioned data files. You can view the row count for each data file in the file size report.
Decrease the row count
Configure a smaller row count to increase the number of repartitioned data files. You might want to increase the number of repartitioned data files to help improve the performance of complex queries.
For example, the system has memory issues when you run a query on data files that include a large number of rows. The query includes a complex join statement on a table that has 10 million rows in each data file. The query is slow because there are too many rows to join for each data file. You repartition the table to include 5 million rows in each repartitioned data file.
Increase the row count
Configure a higher row count to decrease the number of data files. You might want to decrease the number of data files to have fewer files on disk. For example, you want fewer files on disk to simplify your backup strategy. You have a table that users do not run queries on. The table includes a high number of data files. You repartition the table to include 75% fewer data files.
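The effect of each row count option on the number of data files can be illustrated with simple arithmetic. The following Python sketch assumes that the file count is the total number of rows in the table divided by the configured row count, rounded up. The function name `estimate_file_count` and the 200-million-row table are hypothetical and are not part of the Data Vault.

```python
def estimate_file_count(total_rows: int, rows_per_file: int) -> int:
    """Estimate the number of data files needed for a table.

    Assumption: every file holds rows_per_file rows, and the last
    file holds whatever remains.
    """
    if total_rows < 0:
        raise ValueError("total_rows must not be negative")
    if rows_per_file <= 0:
        raise ValueError("rows_per_file must be positive")
    # Integer ceiling division avoids floating-point rounding.
    return (total_rows + rows_per_file - 1) // rows_per_file


# Hypothetical table with 200 million rows.
TOTAL_ROWS = 200_000_000

same = estimate_file_count(TOTAL_ROWS, 10_000_000)    # keep the row count: 20 files
more = estimate_file_count(TOTAL_ROWS, 5_000_000)     # halve the row count: 40 files
fewer = estimate_file_count(TOTAL_ROWS, 40_000_000)   # quadruple the row count: 5 files

print(same, more, fewer)
```

Under this assumption, quadrupling the row count reduces the file count from 20 to 5, which matches the 75% reduction in the example above.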
After the Data Vault determines the total number of repartitioned data files, the Data Vault calculates the minimum and maximum value range for each repartitioned data file. The Data Vault uses the minimum and maximum value ranges to copy data from the original data files to the repartitioned data files.
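The range calculation can be sketched in a similar way. The following Python example is only an illustration, not the Data Vault implementation. It assumes that the values come from a single column, that the rows are ordered by that column, and that each repartitioned data file receives a consecutive block of rows. The function name `value_ranges` and the sample values are hypothetical.

```python
from typing import List, Sequence, Tuple


def value_ranges(values: Sequence[int], rows_per_file: int) -> List[Tuple[int, int]]:
    """Return a (minimum, maximum) pair for each repartitioned data file.

    Assumption: rows are sorted by the column value and split into
    consecutive blocks of rows_per_file rows.
    """
    if rows_per_file <= 0:
        raise ValueError("rows_per_file must be positive")
    ordered = sorted(values)
    ranges = []
    for start in range(0, len(ordered), rows_per_file):
        block = ordered[start:start + rows_per_file]
        # The block is sorted, so its first and last items are the bounds.
        ranges.append((block[0], block[-1]))
    return ranges


# Hypothetical column values for a table with 10 rows.
sample = [17, 3, 42, 8, 25, 11, 30, 5, 21, 36]

# A row count of 4 yields three files with these value ranges.
print(value_ranges(sample, 4))  # [(3, 11), (17, 30), (36, 42)]
```

In this sketch, a row from an original data file is copied to the repartitioned data file whose range contains the row's column value.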
