Repartition archived data to improve query performance and to free resources. You can also repartition data to change the number of rows in data files.
Consider data repartitioning for the following use cases:
To improve query performance
Consider data repartitioning if query performance is poor. Query performance might be poor if the query filter does not match the partition key that you used to archive the data. The optimizer cannot effectively limit the data file selection if rows that match the query filter exist in a majority of the data files. In addition, the query might consume more resources because of the number of data files that it processes. Repartition the data to reduce the number of data files involved in query processing, which improves query performance.
You might want to repartition data if reporting requirements change after you archive the data. For example, suppose that you used a timestamp characteristic as the original partition key when you archived the data. Queries that filter by the timestamp characteristic perform well. One year after you archived the data, you want to filter the data by customer ID. Queries that filter by the customer ID perform poorly because the customer ID is not the partition key. Repartition the data based on the customer ID to improve the query performance.
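The effect of a mismatched partition key can be illustrated with a small simulation. The following Python sketch is not part of the product and does not use its API. It assumes a simplified model in which each distinct partition key value maps to one data file, and the function names partition and files_scanned are hypothetical. It counts how many files contain rows that match a customer ID filter before and after repartitioning.

```python
from collections import defaultdict

def partition(rows, key):
    """Group rows into hypothetical data files, one file per distinct key value."""
    files = defaultdict(list)
    for row in rows:
        files[row[key]].append(row)
    return dict(files)

def files_scanned(files, column, value):
    """Count the files that hold at least one row where column equals value."""
    return sum(
        1 for file_rows in files.values()
        if any(row[column] == value for row in file_rows)
    )

# Illustrative data: 12 months, 4 customers, one row per customer per month.
rows = [
    {"month": month, "customer_id": customer}
    for month in range(1, 13)
    for customer in range(1, 5)
]

by_month = partition(rows, "month")           # original partition key
by_customer = partition(rows, "customer_id")  # after repartitioning

# Filter on the original key: only one file qualifies.
print(files_scanned(by_month, "month", 6))
# Filter on customer ID against month-partitioned files: every file qualifies.
print(files_scanned(by_month, "customer_id", 3))
# The same filter after repartitioning by customer ID: one file qualifies.
print(files_scanned(by_customer, "customer_id", 3))
```

In this model, the customer ID filter touches all 12 month-based files but only one customer-based file, which is the reduction in data file selection that repartitioning is meant to achieve.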
To change the number of rows in data files
Consider data repartitioning to change the number of rows in data files. When you repartition data, you can increase or decrease the number of rows in each data file. You might want to increase the number of rows to reduce the number of data files on disk. You might want to decrease the number of rows to improve the performance of complex joins.
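The trade-off between rows per file and the number of files is simple arithmetic. The following Python sketch assumes that rows are distributed evenly across files, which real partitions might not do. The function name data_file_count and the row counts are hypothetical values chosen for illustration.

```python
def data_file_count(total_rows, rows_per_file):
    """Return the number of files needed to hold total_rows at rows_per_file each."""
    if rows_per_file <= 0:
        raise ValueError("rows_per_file must be positive")
    # Integer ceiling division: a partially filled last file still counts as a file.
    return -(-total_rows // rows_per_file)

total_rows = 10_000_000  # illustrative archive size

# Fewer rows per file produce more, smaller files.
small_files = data_file_count(total_rows, 100_000)
# More rows per file produce fewer, larger files on disk.
large_files = data_file_count(total_rows, 1_000_000)

print(small_files, large_files)
```

Under these assumptions, a tenfold increase in rows per file reduces the file count tenfold, from 100 files to 10.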