Title: Modelling Variation in Spoken And Written English: The multi-dimensional approach revisited
Author: David Lee
Homepage: http://clix.to/davidlee
Degree Awarded: Lancaster University, Department of Linguistics and English Language
Degree Date: 2000
Linguistic Subfield(s): Sociolinguistics
Subject Language(s): English
Director(s): Geoffrey Leech

Abstract:

This study is partly an attempt to replicate and extend Biber's (1988) multidimensional (MD) work on variation in spoken and written English, and partly a critique of it. Biber's statistical, corpus-based methodology is reworked using fresh data (a 4-million-word subset of the British National Corpus) and essentially the same linguistic features (with additions), with a view to assessing the validity, stability and meaningfulness of his results. The overarching goal is a stringent and critical re-evaluation of the statistical methodology, factor analysis, as employed and implemented by Biber in relation to language data, and hence of the conclusions and applications of that study. A large number of different factor analyses are performed (including re-analyses of Biber's own data), varying the statistical parameters (rotations, extractions, number of factors), the variables (transformed variables, differently scored variables, different numbers of variables) and the data subsets (spoken, written, random, and non-random). The results show that Biber's solution for 'the English language' is not the only possible one, not even for his own data, and that variations in the procedure (in particular, the composition of the corpus data and the number of variables) can distinctly affect the number and nature of the dimensions. More specifically, using a smaller, statistically more secure set of variables, with high communalities and MSAs (measures of sampling adequacy), on mixed data points to a greatly reduced set of very ordinary, basic and generally unsurprising linguistic 'dimensions' of variation in 'general English'. Such a result is in fact predictable, given the nature of the data input (heterogeneous) and the statistical technique itself (maximising variances).
The implications of the various results, together with the analytical re-assessment of the methodology, suggest that a cautionary note needs to be sounded on the interpretation and application of such mixed-corpus-derived dimensions in textual studies.
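
The abstract mentions screening variables by their MSAs (measures of sampling adequacy) before factor analysis. As a rough illustration only, and not the procedure actually used in the dissertation, the sketch below computes a per-variable Kaiser-Meyer-Olkin MSA in plain Python. The data are synthetic, the feature names are hypothetical, and the 0.5 cut-off is a commonly cited rule of thumb assumed here rather than taken from the study.

```python
import math
import random


def correlation_matrix(rows):
    """Pearson correlations between the columns of `rows` (one row per text)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    norms = [math.sqrt(sum(c[j] ** 2 for c in centered)) for j in range(p)]
    return [[sum(c[j] * c[k] for c in centered) / (norms[j] * norms[k])
             for k in range(p)] for j in range(p)]


def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting."""
    p = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(p)]
           for i, row in enumerate(matrix)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(p):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[p:] for row in aug]


def msa_per_variable(corr):
    """KMO measure of sampling adequacy for each variable.

    MSA_j = sum(r_jk^2) / (sum(r_jk^2) + sum(q_jk^2)) over k != j, where
    q_jk is the partial correlation obtained from the inverse of `corr`.
    """
    inv = invert(corr)
    p = len(corr)
    result = []
    for j in range(p):
        others = [k for k in range(p) if k != j]
        r2 = sum(corr[j][k] ** 2 for k in others)
        q2 = sum(inv[j][k] ** 2 / (inv[j][j] * inv[k][k]) for k in others)
        result.append(r2 / (r2 + q2))
    return result


# Synthetic data: three features share one latent source, the fourth is noise.
# The feature names are hypothetical and chosen only for illustration.
random.seed(0)
features = ["contractions", "first_person_pronouns", "private_verbs", "noise"]
rows = []
for _ in range(200):
    latent = random.gauss(0, 1)
    rows.append([latent + random.gauss(0, 0.5),
                 latent + random.gauss(0, 0.5),
                 latent + random.gauss(0, 0.5),
                 random.gauss(0, 1)])

corr = correlation_matrix(rows)
msa = msa_per_variable(corr)
# Assumed rule of thumb: keep variables whose MSA is at least 0.5.
retained = [name for name, value in zip(features, msa) if value >= 0.5]
for name, value in zip(features, msa):
    print(f"{name}: MSA = {value:.2f}")
```

A screening step of this kind would typically be rerun after each variable is dropped, since removing one variable changes the partial correlations of the rest.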
Page Updated: 26-Nov-2009
