|Title:||Robust Processing for Constraint-based Grammar Formalisms|
|Author:||Frederik Fouvry|
|Institution:||University of Essex, BA in English Language & Linguistics Schemes|
|Linguistic Subfield(s):||Computational Linguistics;|
|Abstract:||This thesis addresses the issue of how Natural Language Processing (NLP) systems using 'constraint-based' grammar formalisms can be made 'robust,' i.e. able to deal with input which is in some way ill-formed or extra-grammatical. In NLP systems which use constraint-based grammars, the operation of 'unification' typically plays a central role. Accordingly, the central concern of this thesis is to propose an approach to 'robust unification.'
The first part of the thesis underlines the importance of robustness in NLP, provides an overview of the sort of phenomena that require it, and reviews the state of the art. From this, it appears that no methods currently exist for robust processing with grammars of any real linguistic sophistication.
The class of constraint-based grammars studied here is that based on Typed Feature Logic (TFL), of which Head-Driven Phrase Structure Grammar is the instance chosen for exemplification. The formalism is described in the second part of the thesis.
Grammars based on TFL involve the notion of a 'signature,' which defines the kinds of objects ('types') assumed to exist in the grammar. Processing typically involves combining information about pieces of the input by unification. From this perspective, the need for robustness can be seen as arising because pieces of the input provide information which is inconsistent with information from other pieces of the input and/or from the grammar. The first inconsistency is tolerated (it does not violate the grammar) and processed using 'robust types' which are created by extending the signature to a lattice. Inconsistency with the grammar, on the other hand, is punished by stripping away the offending information. Weights, added to it on the basis of the grammar, also disappear, thus making the ungrammaticality measurable. The conceptual and formal apparatus for this is developed and exemplified in the third part of the dissertation.
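As a rough illustration of the idea of tolerating inconsistent types rather than failing, the following Python sketch may help. It is not the construction developed in the thesis: the toy signature `SUPERTYPES`, the function names, and the choice to model a 'robust type' as a set of mutually incompatible ordinary types are all assumptions made for this example, and the hierarchy is assumed to be a simple tree (no multiple inheritance).

```python
# Hypothetical toy signature: each type maps to its immediate supertypes.
# Assumption: a tree-shaped hierarchy, so two types are compatible only
# if one of them is a supertype of the other.
SUPERTYPES = {
    "top": [],
    "number": ["top"],
    "sg": ["number"],
    "pl": ["number"],
    "person": ["top"],
    "third": ["person"],
}


def ancestors(t):
    """Return t together with all of its (transitive) supertypes."""
    seen = {t}
    stack = [t]
    while stack:
        for parent in SUPERTYPES[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def unify_types(a, b):
    """Ordinary unification: the more specific type, or None on a clash."""
    if b in ancestors(a):
        return a
    if a in ancestors(b):
        return b
    return None


def robust_unify(a, b):
    """Illustrative 'robust' unification over sets of types.

    Each argument is a frozenset of types. Instead of failing on a
    clash, the result keeps every conflicting type, dropping only those
    members that are proper supertypes of another member.
    """
    members = set(a) | set(b)
    return frozenset(
        m for m in members
        if not any(m != o and m in ancestors(o) for o in members)
    )


def inconsistency(robust_type):
    """Hypothetical measure: number of extra, conflicting members."""
    return len(robust_type) - 1


compatible = robust_unify(frozenset({"sg"}), frozenset({"number"}))
clash = robust_unify(frozenset({"sg"}), frozenset({"pl"}))
```

Under these assumptions, `compatible` contains only `sg`, exactly as ordinary unification would give, while `clash` records both `sg` and `pl` where `unify_types("sg", "pl")` would simply return `None`. The size of the resulting set gives one crude way of making the degree of inconsistency measurable; the weighting scheme described in the abstract is a separate mechanism that this sketch does not attempt to reproduce.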