Dissertation Information


Title: Corpus-based Parse Pruning - Applying Empirical Data to Symbolic Knowledge
Author: Sonja Müller
Homepage: http://www.dadazunano.de
Institution: Saarland University, Department of Computational Linguistics and Phonetics
Completed in: 2000
Linguistic Subfield(s): Syntax
Director(s): Hans Uszkoreit, Manfred Pinkal

Abstract: In parsing natural language, the number of syntactically ambiguous situations inevitably grows with the coverage of the grammar. Most broad-coverage applications therefore use some supplementary mechanism to estimate the relative probability of competing (partial) analyses.

In this thesis, I propose corpus-based parse pruning: a database of probabilistically weighted, multi-level constituent structures is generated from a stratificational German corpus and used as a backbone for a broad-coverage dependency grammar (Slot Grammar).

This pruning approach yields high-quality parsing results. An extensive evaluation of the syntactic variety in the training corpus, together with a series of experiments on the quantity and quality of the constituent structures used for pruning, gives further insight into the criteria that help a language model become representative and dynamically adaptable: corpus size, a multi-purpose annotation scheme, and a wide variety of authors.