Review of Corpus Linguistics

Reviewer: Jon Clenton
Book Title: Corpus Linguistics
Book Author: Geoffrey Sampson; Diana McCarthy
Publisher: Bloomsbury Publishing (formerly The Continuum International Publishing Group)
Linguistic Field(s): Text/Corpus Linguistics
Issue Number: 17.1312

EDITORS: Sampson, Geoffrey; McCarthy, Diana
TITLE: Corpus Linguistics
SUBTITLE: Readings in a Widening Discipline
PUBLISHER: Continuum International Publishing Group Ltd
YEAR: 2006

Jonathan Clenton, Graduate School of Language and Culture, Osaka
University, Japan


This book starts by showing just how far corpus linguistics has
come since the days before 'electronic corpus linguistics' became
commonplace, or 'B.C.' (before computers). Sampson and McCarthy point out how as
early as the eighteenth century ''Dr Johnson based his famous English
dictionary in part on a collection of over 150,000 quotations'', which
they say ''was certainly a corpus of a sort.'' The editors have made
significant contributions to the area of corpus linguistics and here they
provide a coherent presentation of the approach enunciated in
various arenas over the years. The result is a collection of articles
intended as a basic source book of 'background knowledge' for
students working in the field of corpus linguistics.

The 42 papers in this volume will be familiar to most who teach in the
area. They include work by Fries, Francis, Aarts, Altenburg, Hanks,
Biber & Finegan, Sinclair, Collins, Church, Brown, Ihalainen, Hellberg,
Rissanen, Burnage & Dunlop, Leech & Fallon, Frances, Hindle &
Rooth, Louw, Marcus, Kita, Briscoe & Carroll, Tent & Mugler,
Charniak, Mindt, Bod & Scha, Hasund & Stenström, Carletta, Werry,
Resnik & Yarowsky, Hyland & Milton, Core, McEnery, McKelvie, Pols,
Bohmova & Hajicova, Sampson, Campione & Veronis, Kilgarriff, and
Grabe & Post. In addition to the usual author and subject matter
indices, there is a substantial glossary that students will find
invaluable. The book is organised chronologically by the earliest
date on which each item was presented to the public, for example as
an oral presentation or on a web page. Each paper begins with a short
introduction by the editors and ends with a set of notes. These
notes update or clarify the texts, with occasional valuable
references to web locations. The
book ends with a combined bibliography comprising all the work cited
by the articles in the collection (26 pp) and, significantly, a short list of
relevant web-sites followed by an index.


This collation of papers covers a lot of ground in over 500 pages, so a
review of any reasonable length will necessarily be selective. There
are numerous features that make the book easily accessible and
thoroughly rewarding to read. Compendia of this kind with so many
contributors are often disjointed with very little uniformity from chapter
to chapter in terms of theme and style. This is not the case here. The
theme and style are surprisingly consistent and the editors'
introductions to each chapter contribute to the cohesive whole. There
are also many cross-references between chapters, allowing the
editors to build upon the foundations of other contributors' work and,
therefore, eliminate redundancies. Some areas that might be
considered central to corpus linguistics are missing from the contents:
the readings do not include work on learner corpora, corpus-based
teaching material, or how corpora can be used by language learners
themselves. But the editors argue, and I agree, that
compartmentalization of the volume by topic would serve to make the
readings less accessible to newcomers. This collection of papers
shows just how much corpus linguistics has evolved as an activity; it
is not a broader guide outlining practical applications for
language teaching. This is a welcome focus, and probably makes the
readings a stronger collection than if they had attempted to include
everything in the field.

The volume shows just how diverse and complex corpus-based
research can be. The contents range from the earliest contribution,
which deals with corpora used to describe the structure of English,
painstakingly taken (B.C.) from 250,000 words of telephone
conversation, to a later paper that challenges generative linguists'
claims that we have a system of rules in our heads distilled from
experience. Rather, Bod and Scha (1996) propose, human language
users have a *corpus* in their heads derived from a lifetime's
exposure to language. More recent still is Sampson's (1999)
own contribution, which highlights how grammatical complexity
continues to develop throughout life, well beyond the alleged
'critical period' at around the time of puberty, and which is
supported by evidence from the CHRISTINE corpus. These examples are
cited to indicate how broad the readings' coverage is, not to suggest
that the book consistently argues in favour of corpus-based research
over generative linguists' intuitions.

One should not, then, expect this book to challenge generative
linguistics from the standpoint of corpus linguistic investigation. It is not
a popularising work directed towards converting the world of
generative linguistics to corpus-based methods. That said, the editors
do point out that empirical evidence reveals patterns that are actually in
quite heavy use when generative linguists' intuitions suggest they are
not. As such, the examples cited throughout the book provide some
very concrete data and provocative arguments. Nevertheless, it
appears unlikely that corpora will ever be used very widely by
generative grammarians, in spite of the fact that some
generative discussions of language have been based on corpora, and
have demonstrated potential for advancing generative theory.
Corpora may well yet prove to be an excellent source for verifying
linguistic hypotheses.

Overall, the book is an extremely valuable resource in its own right,
and not only as a reference for corpus linguists. Those newly
interested in the area will also find the volume an essential collection,
not least to understand the wider field of corpus linguistics and the
historical developments it has undergone. The richness of the book lies
in the editors' vast collective experience and knowledge in presenting
the field's development in terms of linguistic research. A strong feature of the
book is the inclusion of many useful figures and tables that serve to
capture the research findings in a concrete manner for the reader.
This excellent book should be required reading for students and
teachers involved in corpus-based research and will be generally
useful to anyone who seeks a more comprehensive understanding of
the resurgence of corpus-based linguistics. This is an impressive
volume that demonstrates just how far the field has progressed over
the last 50 years.

[The 2004 edition of this book was reviewed in --Eds.]

Jonathan Clenton teaches English and corpus linguistics at Osaka
University's Graduate School of Language and Culture, Japan. His
current research focuses on developmental work on vocabulary

Format: Paperback
ISBN: 082648803X
ISBN-13: N/A
Pages: 544
Prices: U.S. $ 49.95