Candidates are determined by the data retention policy. For example, if a customer decides to archive everything older than 3 years, candidate generation identifies all transactions older than 3 years. These are the "candidates" for archive/purge.
Once the candidates have been identified, the next step is to determine whether each candidate can be archived and purged; that is, whether it passes the business rule(s) for archiving and purging. The end result of candidate generation is a list of transactions that can be archived and purged. The engine uses this list to archive (copy to stage) and purge (delete from source) the data.
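The two steps described so far (select candidates by retention policy, then check each candidate against the business rules) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the engine's actual implementation: the Transaction fields, the RULES table, and the function names are all hypothetical. The rule name INVOICE IS UNPAID labels the condition that causes a failure, so its check passes only when the invoice is paid.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable


@dataclass
class Transaction:
    # Hypothetical minimal shape of a transaction record.
    txn_id: int
    txn_date: date
    is_paid: bool


# A rule is a predicate that returns True when the transaction PASSES.
Rule = Callable[[Transaction], bool]

# Hypothetical rule table, keyed by the rule name reported on failure.
RULES: dict[str, Rule] = {
    "INVOICE IS UNPAID": lambda t: t.is_paid,
}


def years_before(d: date, years: int) -> date:
    """Return the date `years` years before `d` (Feb 29 falls back to Feb 28)."""
    try:
        return d.replace(year=d.year - years)
    except ValueError:
        return d.replace(year=d.year - years, day=28)


def generate_candidates(transactions, as_of, retention_years=3):
    """Step 1: every transaction older than the retention cutoff is a candidate."""
    cutoff = years_before(as_of, retention_years)
    return [t for t in transactions if t.txn_date < cutoff]


def evaluate(candidates, rules):
    """Step 2: split candidates into eligible ids and per-id failed rule names."""
    eligible = []
    failures = {}
    for t in candidates:
        failed = [name for name, rule in rules.items() if not rule(t)]
        if failed:
            failures[t.txn_id] = failed
        else:
            eligible.append(t.txn_id)
    return eligible, failures
```

In this sketch, the eligible list corresponds to what the engine would archive and purge, while the failures mapping records which rules blocked each remaining candidate.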
The power of candidate generation is that it tells us WHY a transaction cannot be archived and purged; that is, it tells us which business rules the transaction failed. This helps us in a few ways. In particular, we focus on rules that a majority of transactions failed. For example, if 80% of invoices failed the INVOICE IS UNPAID rule, then we can surmise:
Maybe this data was converted incorrectly.
Maybe the customer doesn’t PAY invoices in this system.
Maybe a patch was applied that adversely affected the PAYMENT functionality.
Maybe the PAYMENT functionality was implemented incorrectly for this timeframe and not corrected until later.
Identifying these types of causes allows us to MODIFY the rule instead of modifying the data. Our preference is to never modify data to pass a business rule.
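The failure analysis described above, finding the rules that a majority of candidates failed, can be sketched as a simple tally. This is a minimal illustration, assuming the failure results are available as a mapping from transaction id to the list of failed rule names; the function names, the 0.5 threshold, and the PO IS OPEN rule in the usage example are hypothetical.

```python
from collections import Counter


def failure_rates(failures, total_candidates):
    """Return the fraction of all candidates that failed each rule."""
    if total_candidates == 0:
        return {}
    counts = Counter(rule for failed in failures.values() for rule in failed)
    return {rule: n / total_candidates for rule, n in counts.items()}


def rules_to_review(failures, total_candidates, threshold=0.5):
    """List rules failed by more than `threshold` of candidates (a majority by default)."""
    rates = failure_rates(failures, total_candidates)
    return sorted(rule for rule, rate in rates.items() if rate > threshold)


# Usage: 5 candidates, 4 of which failed INVOICE IS UNPAID.
example_failures = {
    1: ["INVOICE IS UNPAID"],
    2: ["INVOICE IS UNPAID", "PO IS OPEN"],  # PO IS OPEN is a made-up rule
    3: ["INVOICE IS UNPAID"],
    4: ["INVOICE IS UNPAID"],
}
print(rules_to_review(example_failures, total_candidates=5))
```

A rule that surfaces here is a prompt to investigate the cause and, where appropriate, adjust the rule; the data itself is left unchanged.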