LINGUIST List 17.1380

Fri May 05 2006

Diss: Text/Corpus Ling: Sepp: 'Phonological Constrai...'

Editor for this issue: Meredith Valant <meredithlinguistlist.org>


To post to LINGUIST, use our convenient web form at http://linguistlist.org/LL/posttolinguist.html.
Directory
        1.    Mary Sepp, Phonological Constraints and Free Variation in Compounding: A corpus study of English and Estonian noun compounds


Message 1: Phonological Constraints and Free Variation in Compounding: A corpus study of English and Estonian noun compounds
Date: 03-May-2006
From: Mary Sepp <mmseppyahoo.com>
Subject: Phonological Constraints and Free Variation in Compounding: A corpus study of English and Estonian noun compounds


Institution: City University of New York
Program: Linguistics Program
Dissertation Status: Completed
Degree Date: 2006

Author: Mary Sepp

Dissertation Title: Phonological Constraints and Free Variation in Compounding: A corpus study of English and Estonian noun compounds

Linguistic Field(s): Text/Corpus Linguistics

Subject Language(s): English (eng)

Dissertation Director:
Martin Chodorow

Dissertation Abstract:

This research was designed to examine the patterns of variation in the
phonological and/or orthographic form of Estonian and English noun
compounds. Estonian noun compounds generally occur in one of two forms:
N1(nominative) + N2, as in kool + meister ("schoolmaster"), or N1(genitive)
+ N2, as in kooli + õpetaja ("schoolteacher"). Some Estonian compounds
vary freely in form - e.g., veebsepp/veebisepp ("webmaster"). English noun
compounds exhibit orthographic variation, as they may be written in three
ways: closed ("bookstore"), hyphenated ("dot-com"), or open ("space
station"). Many English compounds also vary freely - e.g.,
cellphone/cell-phone/cell phone. The principal goal of this study was to
use statistical data derived from corpora to determine which variables best
account for the choice of variant compound forms.

The 1,094 Estonian compounds used in this research came from a
one-million-word corpus of Estonian literary and news texts. Data on variation of form
were obtained from Google searches of the World Wide Web. Results showed a
strong preference for genitive forms, and it was posited that this
preference is due to general principles of ease of pronunciation and ease
of perception.

Phonology is also a factor in the distribution of English compounds. A
number of phonological variables were examined in the current study: number
of syllables, presence of compound stress, vowel sequences across internal
lexical boundaries, and double consonants across internal lexical
boundaries. Frequency data for these variables were extracted from a
fourteen-million-word English corpus. Results of multiple regression
analyses showed that the number of syllables in the compound is a stronger
predictor of orthographic form than the other phonological features that
were tested. Phonology was not assumed to be the only influence, however;
lexical features were also examined. Results indicated a substantial
contribution of the second constituent in predicting whether the compound
would be open or closed, and a lesser, though important, contribution of
the first constituent. A regression analysis combining phonological and
lexical variables accounted for about 68% of the variance in the
orthography of 707 high-frequency English noun compounds.


