Review of  Corpus Linguistics and World Englishes

Reviewer: Xiaofei Lu
Book Title: Corpus Linguistics and World Englishes
Book Author: Vivian Anne de Klerk
Publisher: Bloomsbury Publishing (formerly The Continuum International Publishing Group)
Linguistic Field(s): Sociolinguistics
Text/Corpus Linguistics
Subject Language(s): English
Book Announcement: 18.1195

AUTHOR: de Klerk, Vivian
TITLE: Corpus Linguistics and World Englishes
SUBTITLE: An Analysis of Xhosa English
SERIES: Research in Corpus and Discourse
PUBLISHER: Continuum
YEAR: 2006

Xiaofei Lu, Department of Linguistics and Applied Language Studies,
Pennsylvania State University


This book presents research on a variety of English spoken by the Xhosa
people in South Africa as a second language. The author takes the stand
that Black South African English (BSAE) is not a monolithic variety;
rather, the varieties of English spoken by speakers of different indigenous
languages of South Africa should be examined individually. She undertakes
the enterprise of compiling a spoken corpus of Xhosa English (XE) and
closely studies this particular English variety using the corpus. The book
consists of thirteen chapters, divided into four sections.

Section one, ''The context'', contains three chapters that provide an
overview of the socio-linguistic context of the XE corpus. Chapter one,
''Xhosa English as a World English'', describes the linguistic scene in South
Africa from a historical perspective. The author pinpoints the importance
of recognizing BSAE as a heterogeneous variety that consists of
sub-varieties used by speakers of South Africa's indigenous languages and
draws attention to the problems in defining XE as one such sub-variety due
to the dramatic differences in competence among the L2 speakers. The
chapter concludes with insights into the debate on the status of BSAE and
its sub-varieties within the nation's linguistic scene.

Chapter two, ''The need for norms: building a spoken corpus'', opens with an
explanation of the notions of endonormative vs. exonormative standards and
linguistic gatekeeping and then elaborates how large corpora can be used to
obtain empirical evidence for describing emergent norms and language
systems. The International Corpus of English approach for South African
Englishes is criticized. The choice of building a spoken instead of written
corpus of XE is justified, and the applications of such a corpus are
discussed.

Chapter three, ''The structure of the Xhosa English corpus'', outlines the
design and compilation of the corpus of spoken XE. The author details
issues relating to the size and structure of the corpus, criteria for
informant selection, methodology for data collection, conventions of data
transcription and mark-up, and adjuncts to the corpus. She points out that
the analyses of the corpus do not follow any single theoretical framework,
but aim to demonstrate how a corpus can be useful to different theoretical
perspectives.

Section two, ''Corpus studies and sociolinguistic insights'', comprises five
chapters that investigate a range of linguistic phenomena in relation to
the social and psychological issues underlying the XE speech community.
Chapter four, ''Topic choices and lexical characteristics'', summarizes some
of the most prevalent topic selections and recurrent lexical choices in the
corpus. The author argues that these choices offer glimpses of the primary
preoccupations of the community. The chapter also describes a number of
phonological and lexical features that characterize XE as an L2 variety of
English.

Chapter five, ''The role of APPRAISAL resources in discussing AIDS'', reports
results of a qualitative analysis of selected conversations with respect to
how APPRAISAL (as in Systemic Functional Grammar) resources are used in
discussions about AIDS, one of the predominant discussion topics. XE
speakers are found to use significantly more evoked than inscribed
expressions of affect. The author speculates that this could be due to
either the lack of linguistic resources at the speakers' disposal or the
nature of the AIDS topic, which calls for consensus negotiation within the
community.

Chapter six, ''Formulaic utterances'', compares the frequencies of formulaic
expressions in the XE corpus and a benchmark L1 corpus, the Wellington
Corpus of Spoken New Zealand English (Holmes 1995). The author reviews the
general functions of formulae in speech and puts forth a simple method for
extracting formulaic expressions from the corpus. She finds that, with few
exceptions, XE speakers use essentially the same formulaic expressions as
native speakers. She argues that formulae are especially useful for second
language speakers in achieving naturalness in informal speech.

Chapter seven, ''Codeswitching in the corpus'', analyzes the codeswitching
behavior in spontaneous speech among the XE speakers to assess their levels
of bilingualism. The author notes that the analysis helps one understand
issues of linguistic identity in the speech community. Three types of
codeswitching are analyzed: switching on content morphemes, switching at a
time of uncertainty, and switching that serves a communicative purpose. The
author reports reasonably good levels of bilingualism among the speakers.

Chapter eight, ''Informal conversation versus legal discourse'', compares the
patterns of lexical choice in highly specialist legal discourse with those
in the XE corpus and the New Zealand English corpus. Selected legal
vocabulary and nominalizations are scrutinized. The author identifies
notable differences between the lexis of legal presentations and informal
speech and argues that these differences cause non-specialists considerable
difficulty in coping with legal language.

Section three, ''Corpus studies and linguistic description'', describes
selected formal linguistic aspects of the XE corpus. Chapter nine, ''The
syntactic features of Xhosa English'', illustrates some of the most
characteristic syntactic features of XE, such as extended use of the
progressive, overgeneralization in the use of quantifiers, and 'can be
able' as a modal verb. However, the author comes to the conclusion that no
overwhelming evidence exists for profuse emergent norms in the corpus.

Chapter ten, ''The use of discourse markers: the case of 'actually''',
provides an in-depth analysis of the use of the discourse marker 'actually'
in the corpus. The author points out that despite the crucial role of
discourse markers in conversations, they are unduly neglected in the
language classroom. The functions of 'actually' as a discourse or
propositional modifier as well as its placement are examined. The author
claims that XE speakers use 'actually' in much the same way as English
mother-tongue speakers.

Chapter eleven, ''Procedural meanings of 'well' in the corpus'', focuses on
the contextualized uses of the discourse marker 'well' in the corpus. The
author argues for the existence of a unified context-free 'core' meaning of
the marker but also identifies several loose categories of procedural
meaning within the core, e.g., 'well' as a marker of discourse coherence
and 'well' as a signal of turn change.

Chapter twelve, ''Expressing levels of intensity in Xhosa English'',
describes the use of intensifiers in the corpus, focusing in particular on
those accompanying gradable adverbs and adjectives that allow comparison
and modification. The author finds that intensifiers used by XE speakers
are limited in range and that they often serve as place-fillers rather than
to strengthen assertions.

Chapter thirteen, ''The future of Xhosa English: social and educational
issues'', makes up the fourth and final section, ''Looking ahead''. The author
raises the question of the future role of XE, or English in general, as
opposed to that of the indigenous languages in South Africa and recommends
a series of directions for further research.


This book should be of great interest to students and researchers in world
Englishes, language variation, corpus linguistics, and applied linguistics.
One of the major assets of the book is the author's consideration of the
differences between the indigenous languages in South Africa and her
definition of XE, instead of just BSAE, as a world English. This definition
gives the author the opportunity to closely examine a coherent linguistic
community. Another major asset of the book is undoubtedly the use of a
large corpus of authentic spoken data for describing the language system.
The corpus itself constitutes an invaluable linguistic resource for
descriptive studies of or comparative studies involving XE.

The range of linguistic phenomena examined is fairly broad, covering a host
of phonological, lexical, syntactic, and discourse features. It is
especially noteworthy that the author makes it a point to interpret the
results in relation to the acquisitional context whereby XE is learned by
its speakers as well as to the pedagogical challenges posed by such a
context. The book also offers inspiring discussions on issues of linguistic
identity and language attitude in the speech community.

The author carefully justifies the theoretical, methodological and
presentational choices made in the book. Nevertheless, some of the choices
are open to discussion. The lack of a coherent theoretical framework is
both a merit and a limitation. On the one hand, the analyses presented in
the book showcase how a corpus can be of use to different theoretical
perspectives in different ways. On the other hand, these analyses may
appear fragmented to readers seeking a coherent, systematic description of
the language system.

Methodologically, the New Zealand English corpus is used as a primary
benchmark L1 corpus for comparative purposes. The author argues that this
is the case because of the absence of a spoken corpus of BSAE and the link
between the two varieties of English. However, one wonders why other
significantly larger spoken corpora of L1 English are not used for
comparison. If Xhosa English is defined as a World English, it would appear
worthwhile to examine the characteristic features of this variety through
comparison with a larger and more representative normative corpus than the
one that is chosen.

Another methodological issue I wish to raise is the limitation of the
computational tool used in the research. The corpus is in its raw form and
is analyzed using WordSmith Tools 3.1 alone. As a result, most of the
analyses reported involve examining the frequency and distribution of
(pre-defined clusters of) individual lexical or phrasal items only. It is
felt that the author could have taken advantage of other text processing
tools for corpus annotation and analysis at more diversified linguistic levels.

Finally, in terms of presentation, the author's deliberate avoidance of
formal statistics in reporting results may appear questionable to a
critical eye, especially since a large chunk of the analysis involves
comparing various kinds of frequency information across two or more corpora.
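
As a minimal sketch of the kind of formal test meant here (it is not drawn
from the book), the log-likelihood statistic widely used in corpus
linguistics compares the frequency of one item in two corpora of different
sizes. The function names, counts, and corpus sizes below are hypothetical
and chosen only for illustration.

```python
import math


def per_million(freq, corpus_size):
    """Normalize a raw frequency to occurrences per million words."""
    return freq * 1_000_000 / corpus_size


def log_likelihood(freq_a, size_a, freq_b, size_b):
    """Log-likelihood (G2) for one item's frequency in two corpora.

    freq_a, freq_b: raw counts of the item in corpus A and corpus B.
    size_a, size_b: total number of words in each corpus.
    """
    total = freq_a + freq_b
    # Expected counts if the item were equally frequent in both corpora.
    expected_a = size_a * total / (size_a + size_b)
    expected_b = size_b * total / (size_a + size_b)
    g2 = 0.0
    for observed, expected in ((freq_a, expected_a), (freq_b, expected_b)):
        if observed > 0:  # the term 0 * ln(0) is treated as 0
            g2 += observed * math.log(observed / expected)
    return 2 * g2


# Hypothetical counts, NOT taken from the XE or New Zealand corpora.
g2 = log_likelihood(150, 500_000, 100, 1_000_000)
# With one degree of freedom, a value above 3.84 corresponds to p < 0.05.
print(round(g2, 2), g2 > 3.84)
```

Reporting such a value alongside the normalized frequencies would let
readers judge whether an observed difference between two corpora is likely
to reflect more than sampling variation.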

All in all, this book is beautifully written, well-structured, and
extremely accessible. It is an exemplary work for students interested in
pursuing corpus-based language studies and a valuable resource for
researchers interested in studying BSAE and XE as world Englishes.


Holmes, J. (1995). The Wellington corpus of spoken New Zealand English: a
progress report. New Zealand English Newsletter, 9, 5-8.

Xiaofei Lu is currently Assistant Professor of Applied Linguistics in the
Department of Linguistics and Applied Language Studies at The Pennsylvania
State University. His research interests are primarily in computational
linguistics, corpus linguistics, and intelligent computer-assisted language
learning.

Format: Hardback
ISBN: 0826488412
ISBN-13: N/A
Pages: 240
Prices: U.K. £ 75.00