Title: Automatic Name Searching in Large Data Bases of International Names
Author: John Hermansen
Homepage: http://www.las-inc.com
Degree Awarded: Georgetown University, Department of Linguistics
Degree Date: 1985
Linguistic Subfield(s): Computational Linguistics
Director(s): John Staczek
Donald Loritz
Richard O'Brien

Abstract:

The problems associated with using a person's name to locate some information about that person confront most people daily. Even with names in other languages, from cultures unfamiliar to us (in which it may be difficult to identify the surname element used to organize most name lists), we manage to locate the information we seek, despite the further complication that names written in non-Roman scripts often admit a variety of transcription possibilities, with no prescriptive authority for determining that one graphemic conversion is 'correct.' This human facility for understanding variations in personal names has proven very difficult to formalize or to enumerate explicitly so that a computer can perform the same function. Given the universal and pressing need for such a name searching system, it is telling that no automatic method exists that adequately handles large data bases of international names, independent of any human interaction. This research explores the literature and evaluates the systems and technology currently in use or available: name-coding techniques (e.g., Soundex), hardware (the Proximity PF474 chip), and a system particularly noteworthy for its pragmatic approach to the problem (the New York State Identification and Intelligence System). The problems of variation in transcription and name structure are presented with specific analyses of Spanish, Arabic, and Chinese names. The conclusion is that the failure of attempts to create a universal system for processing all personal names stems in large part from unwarranted ethnocentric presumptions about the orthographic representation and the structure of personal names. An outline for the development of a practical name search system is provided, as are topics for further, and much needed, research.
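As an illustration of the name-coding techniques mentioned above, the following is a minimal sketch of the standard American Soundex code, not the dissertation's own implementation. The function name `soundex` and the sample names are assumptions chosen for illustration; the abstract does not list specific names. The sketch shows how a code that always keeps the first letter treats alternative romanizations of one name as different names.

```python
# Minimal sketch of American Soundex (illustrative; not taken from the dissertation).
# Consonants are grouped into six digit classes; vowels and y carry no code.
CODES = {}
for digit, letters in enumerate(["bfpv", "cgjkqsxz", "dt", "l", "mn", "r"], start=1):
    for ch in letters:
        CODES[ch] = str(digit)


def soundex(name: str) -> str:
    """Return the 4-character Soundex code, or "" if the name has no a-z letters."""
    letters = [c for c in name.lower() if "a" <= c <= "z"]
    if not letters:
        return ""
    first = letters[0]
    digits = []
    prev = CODES.get(first, "")  # the first letter's class also suppresses repeats
    for c in letters[1:]:
        if c in "hw":
            continue  # h and w are skipped and do not separate equal codes
        code = CODES.get(c, "")
        if code and code != prev:
            digits.append(code)
        prev = code  # a vowel resets prev, so a repeated class is coded again
    return (first.upper() + "".join(digits) + "000")[:4]


if __name__ == "__main__":
    # Hypothetical sample names: romanization variants of the same names.
    for variant in ["Zhang", "Chang", "Qaddafi", "Gaddafi", "Kadhafi"]:
        print(variant, soundex(variant))
```

In this sketch the Pinyin and Wade-Giles spellings "Zhang" and "Chang" share the same digits but differ in the retained first letter, so an exact match on the code misses the pair. The same happens with "Qaddafi", "Gaddafi", and "Kadhafi". This is one concrete way in which a coding scheme built around English orthography can fail on transcribed names.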
Page Updated: 29-Nov-2009