When one sets out to develop an algorithm that solves the problems previously described, one encounters a whole class of new project management and testing problems.
Establishing test data to test the volume performance of on-line search is difficult, not to mention expensive.
Establishing test data to examine algorithm performance requires a representative set drawn from a real population. The data one is interested in may constitute as little as 0.1% of that population, and identifying it is nearly impossible.
The absence of objective criteria for deciding whether a change to an algorithm is right or wrong means that only empirical testing is possible. This in turn means that simulation testing is necessary across the whole population of names. It is no use testing only test cases or problem cases, because every change to the algorithm introduces both benefits and disadvantages; the only sound basis for accepting a change is the net gain in benefit measured in real use on a real population. The extreme skew in the distribution of names, coupled with the high degree of refinement being sought, forces one to discard sampling even when working on very large populations.
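In practice the acceptance decision reduces to a tally over the whole file. The sketch below is purely illustrative and assumes hypothetical old_search and new_search functions together with a log of (search key, wanted record) pairs; the point it makes is that wins and losses are counted over every query, not over a sample of interesting cases.

```python
# Illustrative sketch only: score a proposed change over the whole population.
# old_search, new_search, queries and population are hypothetical stand-ins.

def net_benefit(old_search, new_search, queries, population):
    """Tally, over every query, whether the new version does better or worse.

    Each query is a (search_key, wanted_record_id) pair. A version does
    better when it returns the wanted record, or returns it in a smaller
    candidate set.
    """
    improved = degraded = unchanged = 0
    for key, wanted in queries:
        old_candidates = old_search(key, population)
        new_candidates = new_search(key, population)
        old_hit = wanted in old_candidates
        new_hit = wanted in new_candidates
        if new_hit and not old_hit:
            improved += 1
        elif old_hit and not new_hit:
            degraded += 1
        elif old_hit and new_hit and len(new_candidates) != len(old_candidates):
            # Both found the record; prefer the tighter candidate list.
            if len(new_candidates) < len(old_candidates):
                improved += 1
            else:
                degraded += 1
        else:
            unchanged += 1
    # Accept the change only if the net gain is positive over the full file.
    return {"improved": improved, "degraded": degraded,
            "unchanged": unchanged, "net_gain": improved - degraded}
```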
The relative significance of problems with an algorithm changes with volume. (An example is the barrier one crosses when a set of candidates no longer normally fits on one screen in a dialogue.)
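To make the screen barrier concrete, one can measure how often a candidate list overflows a single screen at different file volumes. The sketch below assumes a hypothetical search function, query log, and an arbitrary screen depth.

```python
# Illustrative sketch: the same algorithm, measured at two file volumes,
# can shift from rarely to routinely overflowing the dialogue screen.
# search, queries and the population files are hypothetical.

SCREEN_ROWS = 20  # assumed number of candidate rows visible in the dialogue

def overflow_rate(search, queries, population):
    """Fraction of searches whose candidate list no longer fits on one screen."""
    over = sum(1 for key, _ in queries
               if len(search(key, population)) > SCREEN_ROWS)
    return over / len(queries)
```

A problem that is negligible on a small file can come to dominate the dialogue on a large one.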
The preoccupation of the designer, programmer, and users with "special cases" leads to an enormous waste of time.
The fact that the algorithm needs to work on different populations of users within one organization can confuse decision-making.
Users are reluctant to accept new algorithms that give 'better' answers in the majority of cases but 'not the same' answers in a minority of cases.
We have encountered users of an algorithm with 92% reliability and 2% selectivity who refused one with 98% reliability and 0.1% selectivity because it did not give the same answers in a parallel run.
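The two figures quoted can be read as follows: reliability as the proportion of searches whose wanted record appears in the candidate list, and selectivity as the average size of the candidate list expressed as a fraction of the file. The sketch below, with hypothetical search functions and query log, shows how both figures and the rate of disagreement would be measured from such a parallel run.

```python
# Illustrative sketch: measuring reliability, selectivity and disagreement
# from a parallel run. search, old_search, new_search and queries are
# hypothetical; "reliability" and "selectivity" are interpreted as described
# in the text above.

def run_statistics(search, queries, file_size):
    """Return (reliability, selectivity) for one algorithm over a query log."""
    hits = 0
    candidates_returned = 0
    for key, wanted in queries:
        candidates = search(key)
        candidates_returned += len(candidates)
        if wanted in candidates:
            hits += 1
    reliability = hits / len(queries)
    selectivity = candidates_returned / (len(queries) * file_size)
    return reliability, selectivity

def disagreement_rate(old_search, new_search, queries):
    """Proportion of searches for which the two algorithms differ at all."""
    differing = sum(1 for key, _ in queries
                    if set(old_search(key)) != set(new_search(key)))
    return differing / len(queries)
```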