EDITORS: Sampson, Geoffrey; McCarthy, Diana
TITLE: Corpus Linguistics
SUBTITLE: Readings in a Widening Discipline
PUBLISHER: Continuum International Publishing Group Ltd
YEAR: 2006
Jonathan Clenton, Graduate School of Language and Culture, Osaka University, Japan
INTRODUCTION
This book opens by showing just how far corpus linguistics has come since the days 'B.C.' (before computers), before 'electronic corpus linguistics' became commonplace. Sampson and McCarthy point out that as early as the eighteenth century ''Dr Johnson based his famous English dictionary in part on a collection of over 150,000 quotations'', which they say ''was certainly a corpus of a sort.'' The editors have made significant contributions to corpus linguistics, and here they provide a coherent presentation of an approach that has been articulated in various arenas over the years. The result is a collection of articles intended as a basic source book of 'background knowledge' for students working in the field.
The 42 papers in this volume will be familiar to most who teach in the area. They include work by Fries, Francis, Aarts, Altenburg, Hanks, Biber & Finegan, Sinclair, Collins, Church, Brown, Ihalainen, Hellberg, Rissanen, Burnage & Dunlop, Leech & Fallon, Frances, Hindle & Rooth, Louw, Marcus, Kita, Briscoe & Carroll, Tent & Mugler, Charniak, Mindt, Bod & Scha, Hasund & Stenström, Carletta, Werry, Resnik & Yarowsky, Hyland & Milton, Core, McEnery, McKelvie, Pols, Bohmova & Hajicova, Sampson, Campione & Veronis, Kilgarriff, and Grabe & Post. In addition to the usual author and subject matter indices, there is a substantial glossary that students will find invaluable. The book is organised chronologically by the earliest date on which each item was presented to the public, for example as an oral presentation or a web page. Each paper begins with a short introduction by the editors and ends with a set of notes. These notes update or clarify the text, with occasional useful references to web locations. The book ends with a combined bibliography of all the work cited in the articles (26 pp.) and, significantly, a short list of relevant websites, followed by an index.
EVALUATION
This collection of papers covers a lot of ground in over 500 pages, so a review of any reasonable length will necessarily be selective. Numerous features make the book easily accessible and thoroughly rewarding to read. Compendia of this kind, with so many contributors, are often disjointed, with very little uniformity of theme and style from chapter to chapter. This is not the case here. The theme and style are surprisingly consistent, and the editors' introductions to each chapter contribute to a cohesive whole. There are also many cross-references between chapters, which allow the editors to build upon the foundations of other contributors' work and thereby eliminate redundancies. Some areas that might be considered central to corpus linguistics are missing from the contents: the readings do not include work on learner corpora, corpus-based teaching material, or how corpora can be used by language learners themselves. But the editors argue, and I agree, that compartmentalization of the volume by topic would make the readings less accessible to newcomers. The collection sets out to show how much corpus linguistics has evolved as an activity; it is not intended as a broader guide to practical applications for language teaching. This is a welcome focus, and probably makes the readings a stronger collection than if the editors had attempted to include everything in the field.
The volume shows just how diverse and complex corpus-based research can be. The contents range from the earliest contribution, which uses a corpus to describe the structure of English, painstakingly compiled (B.C.) from 250,000 words of telephone conversation, to a later paper that challenges generative linguists' claims that we have a system of rules in our heads distilled from experience. Rather, Bod and Scha (1996) propose, human language users have a *corpus* in their heads derived from a lifetime's exposure to language. More recent still is Sampson's (1999) own contribution, which draws on evidence from the CHRISTINE corpus to argue that grammatical complexity continues to develop throughout life, well beyond the alleged 'critical period' at around the time of puberty. These examples are cited to indicate how broad the readings' coverage is, not to suggest that the book consistently argues in favour of corpus-based research over generative linguists' intuitions.
One should not, then, expect this book to challenge generative linguistics from the standpoint of corpus linguistic investigation. It is not a popularising work aimed at converting the world of generative linguistics to corpus-based methods. That said, the editors do point out that empirical evidence reveals patterns that are in quite heavy use even when generative linguists' intuitions suggest they are not. As such, the examples cited throughout the book provide some very concrete data and provocative arguments. Nevertheless, it appears unlikely that corpora will ever be used very widely by generative grammarians, in spite of the fact that some generative discussions of language have been based on corpora and have demonstrated potential for advancing generative theory. Corpora may well yet prove to be an excellent source for verifying linguistic hypotheses.
Overall, the book is an extremely valuable resource in its own right, and not only as a reference for corpus linguists. Those newly interested in the area will also find the volume an essential collection, not least for understanding the wider field of corpus linguistics and the historical developments it has undergone. The richness of the book lies in the editors' vast collective experience and knowledge, which they bring to presenting the development of the field as a strand of linguistic research. A strong feature of the book is the inclusion of many useful figures and tables that capture the research findings in a concrete manner for the reader. This excellent book should be required reading for students and teachers involved in corpus-based research, and will be generally useful to anyone who seeks a more comprehensive understanding of the resurgence of corpus-based linguistics. This is an impressive volume that demonstrates just how far the field has progressed over the last 50 years.
[The 2004 edition of this book was reviewed in http://linguistlist.org/issues/16/16-98.html --Eds.]