Dissertation Information

Title: Robust Processing for Constraint-based Grammar Formalisms
Author: Frederik Fouvry
Institution: University of Essex, BA in English Language & Linguistics Schemes
Completed in: 2003
Linguistic Subfield(s): Computational Linguistics
Director(s): Doug Arnold

Abstract: This thesis addresses the issue of how Natural Language Processing (NLP) systems using 'constraint-based' grammar formalisms can be made 'robust,' i.e. able to deal with input which is in some way ill-formed or extra-grammatical. In NLP systems which use constraint-based grammars, the operation of 'unification' typically plays a central role. Accordingly, the central concern of this thesis is to propose an approach to 'robust unification.'

The first part of the thesis underlines the importance of robustness in NLP, provides an overview of the sort of phenomena that require it, and reviews the state of the art. From this, it appears that no methods currently exist for robust processing with grammars of any real linguistic sophistication.

The class of constraint-based grammars studied here is that based on Typed Feature Logic (TFL), of which Head-Driven Phrase Structure Grammar is the instance chosen for exemplification. The formalism is described in the second part of the thesis.

Grammars based on TFL involve the notion of a 'signature,' which defines the kinds of objects ('types') assumed to exist in the grammar. Processing typically involves combining information about pieces of the input by unification. From this perspective, the need for robustness can be seen as arising because pieces of the input provide information which is inconsistent with information from other pieces of the input and/or from the grammar. The first kind of inconsistency, between pieces of the input, is tolerated because it does not violate the grammar; it is processed using 'robust types,' which are created by extending the signature to a lattice. Inconsistency with the grammar, on the other hand, is punished by stripping away the offending information. Weights attached to that information on the basis of the grammar disappear with it, which makes the ungrammaticality measurable. The conceptual and formal apparatus for this is developed and exemplified in the third part of the dissertation.
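
As a rough illustration of the idea of unifying types over a signature, and of tolerating a clash instead of failing, the following Python sketch may help. It is not taken from the thesis: the toy signature (`SUPERTYPES`), the function names, and the use of a `frozenset` of clashing types as a stand-in for a 'robust type' are all assumptions made for this example, and the abstract does not describe the actual lattice construction or weighting scheme.

```python
# Hypothetical toy signature: each type maps to its immediate supertypes.
SUPERTYPES = {
    "top": set(),
    "number": {"top"},
    "sg": {"number"},
    "pl": {"number"},
    "person": {"top"},
    "third": {"person"},
    "third_sg": {"sg", "third"},
}


def ancestors(t):
    """Return t together with all of its (transitive) supertypes."""
    seen = {t}
    stack = [t]
    while stack:
        for parent in SUPERTYPES[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def subsumes(general, specific):
    """True if `general` is `specific` itself or one of its supertypes."""
    return general in ancestors(specific)


def unify_types(a, b):
    """Return the unique most general common subtype of a and b, or None."""
    common = [t for t in SUPERTYPES if subsumes(a, t) and subsumes(b, t)]
    # Keep only the common subtypes that lie below no other common subtype.
    maximal = [
        t for t in common
        if not any(u != t and subsumes(u, t) for u in common)
    ]
    return maximal[0] if len(maximal) == 1 else None


def robust_unify_types(a, b):
    """Like unify_types, but record a clash instead of failing.

    Returns (result, clashes). On a clash the result is a frozenset holding
    both types, an assumed stand-in for a 'robust type'.
    """
    strict = unify_types(a, b)
    if strict is not None:
        return strict, 0
    return frozenset({a, b}), 1


print(unify_types("sg", "third"))
print(unify_types("sg", "pl"))
print(robust_unify_types("sg", "pl"))
```

In this sketch, ordinary unification of `sg` and `pl` fails because the toy signature gives them no common subtype, while the robust variant keeps both pieces of information and counts the clash, which is one simple way to make the inconsistency visible and countable.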