Editor for this issue: Karolina Owczarzak <karolina linguistlist.org>
New Dissertation Abstract

Institution: University of Colorado at Boulder
Program: Graduate English Department
Dissertation Status: Completed
Degree Date: 2001
Author: Douglas Roland
Dissertation Title: Verb Sense and Verb Subcategorization Probabilities
Dissertation URL: http://rintintin.colorado.edu/~rolandd
Linguistic Field: Text/Corpus Linguistics, Psycholinguistics, Computational Linguistics
Dissertation Director 1: Daniel Jurafsky
Dissertation Director 2: Lise Menn

Dissertation Abstract:

This dissertation investigates a variety of problems in psycholinguistics and computational linguistics caused by the differences in verb subcategorization probabilities found between various corpora and experimental data sets. For psycholinguistics, these problems include the practical problem of which frequencies to use for norming psychological experiments, as well as the more theoretical issue of which frequencies are represented in the mental lexicon and how those frequencies are learned. In computational linguistics, these problems include the decreases in the accuracy of probabilistic applications such as parsers when they are used on corpora other than the one on which they were trained. Evidence is presented showing that different senses of verbs and their corresponding differences in subcategorization, as well as inherent differences between the production of sentences in psychological norming protocols and language use in context, are important causes of the subcategorization frequency differences found between corpora. This suggests that verb subcategorization probabilities should be based on individual senses of verbs rather than the whole verb lexeme, and that 'test tube' sentences are not the same as 'wild' sentences. Hence, the influences of experimental design on verb subcategorization probabilities should be given careful consideration.
This dissertation will demonstrate a model of how the relationship between verb sense and verb subcategorization can be employed to predict verb subcategorization based on the semantic context preceding the verb in corpus data. The predictions made by the model are shown to be the same as predictions made by human subjects given the same contexts.