Table of Contents

Search

  1. Preface
  2. Introduction
  3. The Design Issues
  4. Standard Population Choices
  5. Parsing, Standardization and Cleaning
  6. Customer Identification Systems
  7. Fraud and Intelligence Systems
  8. Marketing Systems
  9. Simple Search
  10. Summary

Application and Database Design Guide

Application and Database Design Guide

File and Field Design Issues

File and Field Design Issues

The design of file and field structures to support reliable name search and matching requires a good understanding of the nature of the data.

More Than One Name Field in a table or file, Names are truly "Many to Many"

It is possible that two people or companies, or products, can have exactly the same name. It is also possible that, even ignoring error and variation, people, places and things have more than one name:
  • People have maiden names and married names;
  • People have aliases and professional names;
  • Companies have registered names, trading names and division names;
  • Places have several addresses, on two separate streets, old addresses, billing addresses, postal addresses, etc.;
  • People and places can have names in more than one language.
The relationship between a name and that which it names is quite naturally a true "many-to-many relationship".
It is not surprising that indexing these "many to many" relations requires careful design in the majority of today’s relational databases, whose constructs are limited (with some good reason) to architectures based on "one to many" relations.
The design of a record or row that contains two fields, one for "name" and one for "maiden name", or "registered name" and "trading name", may make logical business sense, but it is not good for indexing.
When we are searching for a person name, company name or address we do not know which "role" it plays. We do not know if it is a birth name, married name or maiden name, we do not know if it is a current or prior address. In order to address this problem effectively, it is necessary to have several index entries pointing to the same record. The alternative of declaring a separate index on each field or attribute is totally prohibitive from a performance point of view.
Solving this "many to many" characteristic of names leads to an additional table or file in most databases. It therefore requires that this table is maintained in sync with the main business tables.

0 COMMENTS

We’d like to hear from you!