Dissertation Information

Title: Combining Machine Readable Lexical Resources with a Principle Based Parser
Author: Michael McHale
Institution: Syracuse University, Information Studies
Completed in: 1995
Linguistic Subfield(s): Computational Linguistics
Subject Language(s): English
Director(s): Sung Myaeng

Abstract: This research was motivated by the premise that the ability to process unconstrained natural language text would ultimately provide information retrieval (IR) with a very useful tool. To date, most syntax-based Natural Language Processing (NLP) systems that support IR have taken one of two approaches: domain-independent syntactic processing, or syntactic and semantic processing in limited domains. The purpose of this research was to investigate an approach to domain-independent semantic processing: the combination of a principle-based parser (PBP) with a semantically enhanced machine-readable dictionary (MRD).

The parser is an implementation of Chomsky's Government-Binding (GB) theory and therefore provides complete syntactic coverage. The coverage of a parsing system is, however, ultimately a function of the size and richness of its lexicon. To provide both size and richness, the lexicon for the system was extracted from the Longman Dictionary of Contemporary English (LDOCE) and semantically enhanced using Roget’s International Thesaurus.

The research investigated: (1) the impact of using an MRD as the lexicon for a PBP; (2) the automatic extraction of thematic roles from the MRD; and (3) methods to enhance those roles using Roget's.

The results show that: (1) An MRD can indeed be used with a PBP, though the larger, more ambiguous lexicon requires controls in the parser to avoid producing a large forest of candidate parse trees. With such controls, the impact of the larger lexicon becomes no greater for a PBP than for a traditional phrase structure grammar (e.g., ATN, APSG) dealing with lexical ambiguity. (2) LDOCE contains patterns in its definitions that can be exploited to determine thematic roles, a simple form of semantics. The majority of these roles were extracted using simple lexical patterns. (3) The simple thematic roles can be enhanced using semi-automatic methods. A decomposition of Roget’s hierarchy allowed for a procedural mapping of the simple thematic roles to over 1000 roles with 7 levels of abstraction. It is anticipated, but not shown here, that the enhanced roles will provide an improvement in IR capabilities over the simpler thematic roles.
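
As a rough illustration of what extracting roles with "simple lexical patterns" could look like, the Python sketch below matches a few surface patterns against definition text. Everything in it is an assumption made for illustration: the patterns, the role labels (AGENT, INSTRUMENT, LOCATION), the function name guess_role, and the toy definitions are invented here and are not taken from the dissertation or from LDOCE.

```python
import re

# Hypothetical pattern-to-role table. These patterns are illustrative only;
# they are not the patterns used in the dissertation or found in LDOCE.
ROLE_PATTERNS = [
    (re.compile(r"\b(?:a|an) (?:person|someone) who\b"), "AGENT"),
    (re.compile(r"\b(?:a|an) (?:tool|instrument|device) (?:used )?for\b"), "INSTRUMENT"),
    (re.compile(r"\b(?:a|an) (?:place|building|area) where\b"), "LOCATION"),
]


def guess_role(definition):
    """Return the first role whose pattern matches the definition, else None."""
    text = definition.lower()
    for pattern, role in ROLE_PATTERNS:
        if pattern.search(text):
            return role
    return None


# Toy definitions written for this example (not quoted from any dictionary).
TOY_DEFINITIONS = {
    "baker": "a person who bakes bread and cakes",
    "drill": "a tool used for making holes",
    "bakery": "a place where bread is baked",
    "green": "having the colour of grass",
}

# Map each headword to its guessed role; unmatched definitions map to None.
roles = {word: guess_role(text) for word, text in TOY_DEFINITIONS.items()}
```

Definitions that match no pattern are left without a role, which is in keeping with the abstract's statement that the majority of roles, not all of them, were obtained this way.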