LINGUIST List 14.694

Tue Mar 11 2003

Diss: Computational Ling: Fouvry "Robust..."

Editor for this issue: Anita Yahui Huang <anita@linguistlist.org>


Directory

  • fouvry, Computational Ling: Fouvry "Robust Processing..."

    Message 1: Computational Ling: Fouvry "Robust Processing..."

    Date: Mon, 10 Mar 2003 12:02:35 +0000
    From: fouvry <fouvry@coli.uni-sb.de>
    Subject: Computational Ling: Fouvry "Robust Processing..."


    Institution: University of Essex
    Program: Department of Language and Linguistics
    Dissertation Status: Completed
    Degree Date: 2003

    Author: Frederik Fouvry

    Dissertation Title: Robust Processing for Constraint-based Grammar Formalisms

    Linguistic Field: Computational Linguistics

    Dissertation Director 1: Doug J Arnold

    Dissertation Abstract:

    This thesis addresses the issue of how Natural Language Processing (NLP) systems using "constraint-based" grammar formalisms can be made "robust," i.e. able to deal with input which is in some way ill-formed or extra-grammatical. In NLP systems which use constraint-based grammars, the operation of "unification" typically plays a central role. Accordingly, the central concern of this thesis is to propose an approach to "robust unification."
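
    As a rough illustration of the problem, classical unification fails outright as soon as two pieces of input contribute inconsistent values. The Python sketch below is purely expository and not code from the thesis; the dictionary representation and the name unify are ad hoc.

        # Minimal sketch of classical feature-structure unification: feature
        # structures are plain Python dicts, atomic values are strings.
        def unify(fs1, fs2):
            """Return the unification of two feature structures, or None on a clash."""
            if fs1 == fs2:
                return fs1
            if isinstance(fs1, dict) and isinstance(fs2, dict):
                result = dict(fs1)
                for feature, value in fs2.items():
                    if feature in result:
                        unified = unify(result[feature], value)
                        if unified is None:        # values are inconsistent:
                            return None            # the whole unification fails
                        result[feature] = unified
                    else:
                        result[feature] = value
                return result
            return None                            # atomic clash, e.g. 'sg' vs 'pl'

        # An agreement error in the input ("the dogs barks"):
        subject = {'agr': {'num': 'pl'}}
        verb = {'agr': {'num': 'sg'}}
        print(unify(subject, verb))                # None: the analysis is simply lost

    It is exactly this all-or-nothing behaviour that a robust notion of unification has to relax.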

    The first part of the thesis underlines the importance of robustness in NLP, provides an overview of the sort of phenomena that require it, and reviews the state of the art. From this, it appears that no methods currently exist for robust processing with grammars of any real linguistic sophistication.

    The class of constraint-based grammars studied here is that based on Typed Feature Logic (TFL), of which Head-Driven Phrase Structure Grammar is the instance chosen for exemplification. The formalism is described in the second part of the thesis.

    Grammars based on TFL involve the notion of a "signature," which defines the kinds of objects ("types") assumed to exist in the grammar. Processing typically involves combining information about pieces of the input by unification. From this perspective, the need for robustness can be seen as arising because pieces of the input provide information which is inconsistent with information from other pieces of the input and/or from the grammar. The first kind of inconsistency, which does not violate the grammar, is tolerated and processed using "robust types," which are created by extending the signature to a lattice. Inconsistency with the grammar, on the other hand, is punished by stripping away the offending information; the weights that were added to that information on the basis of the grammar disappear with it, which makes the ungrammaticality measurable. The conceptual and formal apparatus for this is developed and exemplified in the third part of the dissertation.
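
    One possible reading of the robust-type idea, sketched purely for illustration: the set representation, the toy COMPATIBLE table and the simple clash count below are assumptions made for this example, not the lattice construction or the weight mechanism formalised in the thesis.

        # Illustrative sketch only: a robust type is modelled as the set of
        # types contributed by the input, and a weight counts clashes against
        # a toy stand-in for the grammar's signature.
        COMPATIBLE = {                             # type pairs with a common subtype
            ('sg', 'sg'), ('pl', 'pl'),
            ('sg', 'agr'), ('agr', 'sg'),
            ('pl', 'agr'), ('agr', 'pl'), ('agr', 'agr'),
        }

        def robust_unify_types(ts1, ts2):
            """Meet of two robust types in the extended lattice: keep all the
            information instead of failing, and record how many type pairs the
            signature rules out."""
            result = frozenset(ts1 | ts2)          # never fail: collect everything
            clashes = sum((a, b) not in COMPATIBLE
                          for a in ts1 for b in ts2)
            return result, clashes

        # The same agreement error as before: both values survive, and the
        # non-zero weight makes the degree of ungrammaticality measurable.
        result, weight = robust_unify_types({'pl'}, {'sg'})
        print(sorted(result), weight)              # ['pl', 'sg'] 1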