LINGUIST List 18.1195

Fri Apr 20 2007

Review: Sociolinguistics; Text/Corpus Linguistics: de Klerk (2006)

Editor for this issue: Laura Buszard-Welcher

Directory
1. Xiaofei Lu, Corpus Linguistics and World Englishes

Message 1: Corpus Linguistics and World Englishes
Date: 20-Apr-2007
From: Xiaofei Lu
Subject: Corpus Linguistics and World Englishes

AUTHOR: de Klerk, Vivian
TITLE: Corpus Linguistics and World Englishes
SUBTITLE: An Analysis of Xhosa English
SERIES: Research in Corpus and Discourse
PUBLISHER: Continuum
YEAR: 2006

Xiaofei Lu, Department of Linguistics and Applied Language Studies, Pennsylvania State University


This book presents research on a variety of English spoken as a second language by the Xhosa people in South Africa. The author takes the stand that Black South African English (BSAE) is not a monolithic variety; rather, the varieties of English spoken by speakers of different indigenous languages of South Africa should be examined individually. She undertakes the enterprise of compiling a spoken corpus of Xhosa English (XE) and closely studies this particular English variety using the corpus. The book consists of thirteen chapters, divided into four sections.

Section one, ''The context'', contains three chapters that provide an overview of the socio-linguistic context of the XE corpus. Chapter one, ''Xhosa English as a World English'', describes the linguistic scene in South Africa from a historical perspective. The author pinpoints the importance of recognizing BSAE as a heterogeneous variety that consists of sub-varieties used by speakers of South Africa's indigenous languages and draws attention to the problems in defining XE as one such sub-variety due to the dramatic differences in competence among the L2 speakers. The chapter concludes with insights into the debate on the status of BSAE and its sub-varieties within the nation's linguistic scene.

Chapter two, ''The need for norms: building a spoken corpus'', opens with an explanation of the notions of endonormative vs. exonormative standards and linguistic gatekeeping and then elaborates how large corpora can be used to obtain empirical evidence for describing emergent norms and language systems. The International Corpus of English approach for South African Englishes is criticized. The choice of building a spoken instead of written corpus of XE is justified, and the applications of such a corpus are discussed.

Chapter three, ''The structure of the Xhosa English corpus'', outlines the design and compilation of the corpus of spoken XE. The author details issues relating to the size and structure of the corpus, criteria for informant selection, methodology for data collection, conventions of data transcription and mark-up, and adjuncts to the corpus. She points out that the analyses of the corpus do not follow any single theoretical framework, but aim to demonstrate how a corpus can be useful to different theoretical perspectives.

Section two, ''Corpus studies and sociolinguistic insights'', comprises five chapters that investigate a range of linguistic phenomena in relation to the social and psychological issues underlying the XE speech community. Chapter four, ''Topic choices and lexical characteristics'', summarizes some of the most prevalent topic selections and recurrent lexical choices in the corpus. The author argues that these choices offer glimpses of the primary preoccupations of the community. The chapter also describes a number of phonological and lexical features that characterize XE as an L2 variety of English.

Chapter five, ''The role of APPRAISAL resources in discussing AIDS'', reports results of a qualitative analysis of selected conversations with respect to how APPRAISAL (as in Systemic Functional Grammar) resources are used in discussions about AIDS, one of the predominant discussion topics. XE speakers are found to use significantly more evoked than inscribed expressions of affect. The author speculates that this could be due to either the lack of linguistic resources at the speakers' disposal or the nature of the AIDS topic, which calls for consensus negotiation within the community.

Chapter six, ''Formulaic utterances'', compares the frequencies of formulaic expressions in the XE corpus and a benchmark L1 corpus, the Wellington Corpus of Spoken New Zealand English (Holmes 1995). The author reviews the general functions of formulae in speech and puts forth a simple method for extracting formulaic expressions from the corpus. She finds that, with few exceptions, XE speakers use essentially the same formulaic expressions as native speakers. She argues that formulae are especially useful for second language speakers in achieving naturalness in informal speech.

Chapter seven, ''Codeswitching in the corpus'', analyzes the codeswitching behavior in spontaneous speech among the XE speakers to assess their levels of bilingualism. The author notes that the analysis helps one understand issues of linguistic identity in the speech community. Three types of codeswitching are analyzed: switching on content morphemes, switching at a time of uncertainty, and switching that serves a communicative purpose. The author reports reasonably good levels of bilingualism among the speakers.

Chapter eight, ''Informal conversation versus legal discourse'', compares the patterns of lexical choice in highly specialist legal discourse with those in the XE corpus and the New Zealand English corpus. Selected legal vocabulary and nominalizations are scrutinized. The author identifies notable differences between the lexis of legal presentations and informal speech and argues that these differences cause non-specialists considerable difficulty in coping with legal language.

Section three, ''Corpus studies and linguistic description'', describes selected formal linguistic aspects of the XE corpus. Chapter nine, ''The syntactic features of Xhosa English'', illustrates some of the most characteristic syntactic features of XE, such as extended use of the progressive, overgeneralization in the use of quantifiers, and 'can be able' as a modal verb. However, the author comes to the conclusion that no overwhelming evidence exists for profuse emergent norms in the corpus.

Chapter ten, ''The use of discourse markers: the case of 'actually''', provides an in-depth analysis of the use of the discourse marker 'actually' in the corpus. The author points out that despite the crucial role of discourse markers in conversations, they are unduly neglected in the language classroom. The functions of 'actually' as a discourse or propositional modifier as well as its placement are examined. The author claims that XE speakers use 'actually' in much the same way as English mother-tongue speakers.

Chapter eleven, ''Procedural meanings of 'well' in the corpus'', focuses on the contextualized uses of the discourse marker 'well' in the corpus. The author argues for the existence of a unified context-free 'core' meaning of the marker but also identifies several loose categories of procedural meaning within the core, e.g., 'well' as a marker of discourse coherence and 'well' as a signal of turn change.

Chapter twelve, ''Expressing levels of intensity in Xhosa English'', describes the use of intensifiers in the corpus, focusing in particular on those accompanying gradable adverbs and adjectives that allow comparison and modification. The author finds that intensifiers used by XE speakers are limited in range and that they often serve as place-fillers rather than as a means of strengthening assertions.

Chapter thirteen, ''The future of Xhosa English: social and educational issues'', makes up the fourth and final section, ''Looking ahead''. The author raises the question of the future role of XE, or English in general, as opposed to that of the indigenous languages in South Africa and recommends a series of directions for further research.


This book should be of great interest to students and researchers in world Englishes, language variation, corpus linguistics, and applied linguistics. One of the major assets of the book is the author's consideration of the differences between the indigenous languages in South Africa and her definition of XE, instead of just BSAE, as a world English. This definition gives the author the opportunity to closely examine a coherent linguistic community. Another major asset of the book is undoubtedly the use of a large corpus of authentic spoken data for describing the language system. The corpus itself constitutes an invaluable linguistic resource for descriptive studies of XE or comparative studies involving it.

The range of linguistic phenomena examined is fairly broad, covering a host of phonological, lexical, syntactic, and discourse features. It is especially noteworthy that the author makes it a point to interpret the results in relation to the acquisitional context whereby XE is learned by its speakers as well as to the pedagogical challenges posed by such a context. The book also offers inspiring discussions on issues of linguistic identity and language attitude in the speech community.

The author carefully justifies the theoretical, methodological and presentational choices made in the book. Nevertheless, some of the choices are open to discussion. The lack of a coherent theoretical framework is both a merit and a limitation. On the one hand, the analyses presented in the book showcase how a corpus can be of use to different theoretical perspectives in different ways. On the other hand, these analyses may appear fragmented to readers seeking a coherent, systematic description of the language system.

Methodologically, the New Zealand English corpus is used as a primary benchmark L1 corpus for comparative purposes. The author argues that this is the case because of the absence of a spoken corpus of BSAE and the link between the two varieties of English. However, one wonders why other significantly larger spoken corpora of L1 English are not used for comparison. If Xhosa English is defined as a World English, it would appear worthwhile to examine the characteristic features of this variety through comparison with a larger and more representative normative corpus than the one that is chosen.

Another methodological issue I wish to raise is the limitation of the computational tool used in the research. The corpus is in its raw form and is analyzed using WordSmith Tools 3.1 alone. As a result, most of the analyses reported involve examining the frequency and distribution of (pre-defined clusters of) individual lexical or phrasal items only. The author could have taken advantage of other text processing tools for corpus annotation and analysis at more diversified linguistic levels.
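To make concrete the kind of cluster-frequency analysis described above, here is a minimal Python sketch that counts recurrent word clusters in a tokenized utterance. It is not WordSmith Tools and not the author's procedure; the function name `count_clusters` and the sample utterance are hypothetical and serve only to illustrate the technique.

```python
from collections import Counter


def count_clusters(tokens, n=3):
    """Count every contiguous n-word cluster in a list of tokens."""
    # Slide a window of width n across the tokens and tally each cluster.
    windows = (tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return Counter(windows)


# Hypothetical utterance, invented for illustration (not from the XE corpus).
sample_tokens = "you know what i mean you know what i mean".split()
cluster_counts = count_clusters(sample_tokens, n=3)

# List the most frequent three-word clusters with their raw frequencies.
for cluster, frequency in cluster_counts.most_common(3):
    print(" ".join(cluster), frequency)
```

Counts of this sort operate on surface word forms only, which is why analyses at other linguistic levels would require an annotated corpus.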

Finally, in terms of presentation, the author's deliberate avoidance of formal statistics in reporting results may appear questionable to a critical eye, especially since a large chunk of the analysis involves comparing various kinds of frequency information across two or more corpora.
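One statistic widely used in corpus linguistics for such comparisons is the log-likelihood (G2) test. The Python sketch below is illustrative only: the book reports no such test, the function name `log_likelihood` is my own, and all counts and corpus sizes are invented.

```python
import math


def log_likelihood(freq_a, size_a, freq_b, size_b):
    """Two-corpus log-likelihood (G2) for one item's frequency.

    freq_a, freq_b: occurrences of the item in corpus A and corpus B.
    size_a, size_b: total number of tokens in each corpus.
    """
    total_freq = freq_a + freq_b
    total_size = size_a + size_b
    # Expected counts if the item were equally frequent in both corpora.
    expected_a = size_a * total_freq / total_size
    expected_b = size_b * total_freq / total_size
    g2 = 0.0
    for observed, expected in ((freq_a, expected_a), (freq_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    return 2.0 * g2


# Invented figures: an item occurring 100 times in 10,000 tokens of one
# corpus and 50 times in 10,000 tokens of another.
score = log_likelihood(100, 10_000, 50, 10_000)
# 3.84 is the chi-square critical value for 1 degree of freedom at p < 0.05.
print(score > 3.84)
```

A score above the critical value suggests that a difference in relative frequency is unlikely to be due to chance alone, which is the kind of support raw frequency counts cannot provide.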

All in all, this book is beautifully written, well-structured, and extremely accessible. It is an exemplary work for students interested in pursuing corpus-based language studies and a valuable resource for researchers interested in studying BSAE and XE as world Englishes.


Holmes, J. (1995). The Wellington corpus of spoken New Zealand English: a progress report. New Zealand English Newsletter, 9, 5-8.


Xiaofei Lu is currently Assistant Professor of Applied Linguistics in the Department of Linguistics and Applied Language Studies at The Pennsylvania State University. His research interests are primarily in computational linguistics, corpus linguistics, and intelligent computer-assisted language learning.