Application and Database Design Guide

The Value/Weakness of Parsing

Parsing analyzes names and addresses in an attempt to identify each token (an initial, word or code) and assign it to an attribute.
Parsing relies on rules about token position, format and context. Punctuation and structure, if available, can assist. For some attributes, dictionaries are helpful.
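As a rough illustration of such rules, the following minimal sketch labels tokens by format, dictionary lookup and position. The function name, the labels and the two tiny dictionaries are hypothetical, chosen only for this example, and punctuation is simply discarded rather than used as a hint.

```python
import re

# Hypothetical, deliberately tiny dictionaries. Real ones are far larger
# and still incomplete.
TITLES = {"mr", "mrs", "ms", "dr"}
STREET_TYPES = {"st", "street", "rd", "road", "ave", "avenue"}


def attribute_tokens(text):
    """Label each token using simple position, format and dictionary rules."""
    # Split on anything that is not a letter or digit (punctuation is dropped).
    tokens = re.findall(r"[A-Za-z0-9]+", text)
    labelled = []
    for position, token in enumerate(tokens):
        lowered = token.lower()
        if position == 0 and lowered in TITLES:
            label = "title"          # position rule plus dictionary
        elif token.isdigit():
            label = "number"         # format rule
        elif len(token) == 1 and token.isalpha():
            label = "initial"        # format rule
        elif lowered in STREET_TYPES:
            label = "street_type"    # dictionary rule
        else:
            label = "word"
        labelled.append((token, label))
    return labelled


print(attribute_tokens("Dr. J. Smith"))
print(attribute_tokens("12 St James St"))
```

In the second call both occurrences of "St" receive the label street_type, although the first is more plausibly an abbreviation of "Saint". Rules this simple cannot resolve that kind of token ambiguity.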
The reliability of parsing is weakened because:
  • names and addresses can be, and are, successfully used by people without following the rules;
  • the rules are sensitive to spelling error;
  • the rules differ from country to country;
  • there is often ambiguity in the tokens;
  • naming dictionaries are incomplete.
If the goal of parsing is simply to satisfy a theoretical need, there is no direct benefit to the user of the data. For real-world use, the best format for names and addresses is the form in which an addressee appears on an envelope.
Search and matching systems do not need to rely on parsing to build search keys or match codes; there are more reliable methods. Critical search systems should never rely on parsing.
On the other hand, gross parsing of addresses can be useful because it reduces the noise returned in a search. An example is splitting an address into a "fine" component (the parts up to but not including the town name) and a "coarse" component (from the town name to the end).
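The fine/coarse split can be sketched as follows. This is an illustration only: the TOWNS set stands in for a real reference table, and the function name, the max_words limit and the choice to take the rightmost town-name match are assumptions made for the example.

```python
# Hypothetical stand-in for a town-name reference table.
TOWNS = {"springfield", "new york", "boston"}


def split_address(address, towns=TOWNS, max_words=3):
    """Split an address into (fine, coarse) at the rightmost known town name.

    fine   = everything before the town name
    coarse = the town name and everything after it
    If no town name is found, the whole address is returned as fine.
    """
    words = address.replace(",", " ").split()
    lowered = [w.lower() for w in words]
    # Scan from the right, because a town name near the end of the address
    # is more likely to be the actual town than one inside a street name.
    for start in range(len(words) - 1, -1, -1):
        # Try longer phrases first so "New York" wins over a shorter match.
        for length in range(max_words, 0, -1):
            phrase = " ".join(lowered[start:start + length])
            if phrase in towns:
                return " ".join(words[:start]), " ".join(words[start:])
    return " ".join(words), ""


fine, coarse = split_address("12 Boston Road, Springfield IL 62701")
print(fine)    # 12 Boston Road
print(coarse)  # Springfield IL 62701
```

In this example "Boston" appears inside the street name, and scanning from the right keeps it in the fine component. A misspelled town name would not be found at all, which is why a split like this is best treated as a noise-reduction aid and not as the basis of a search key.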
Selective parsing is also a viable solution for a number of less critical business functions. Two examples are analyzing a name to discover the most useful word to use in letter personalization, and analyzing an address to discover candidate town names for searching a reference table and applying statistics.
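As one possible sketch of selective parsing for letter personalization, the snippet below picks a single candidate word from a name and ignores everything else. The heuristic (skip titles, suffixes and initials, then take the first remaining word) and all names in the code are assumptions for illustration, not a prescribed method.

```python
# Hypothetical word lists; real lists would be larger and country specific.
TITLES = {"mr", "mrs", "ms", "miss", "dr", "prof"}
SUFFIXES = {"jr", "sr", "ii", "iii"}
SKIP = TITLES | SUFFIXES


def personalization_word(name):
    """Return the first word that is not a title, suffix or initial, else None."""
    tokens = [t.strip(".,") for t in name.split()]
    candidates = [
        t for t in tokens
        if len(t) > 1 and t.lower() not in SKIP  # len > 1 drops initials
    ]
    return candidates[0] if candidates else None


print(personalization_word("Dr. J. Robert Smith Jr."))  # Robert
print(personalization_word("J. K."))                    # None
```

Because only one word is extracted and a fallback (None) is available when nothing suitable is found, an occasional wrong choice has limited impact, which suits this kind of less critical function.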
