Author: Hans Lindquist
Title: Corpus Linguistics and the Description of English
Series Title: Edinburgh Textbooks on the English Language
Publisher: Edinburgh University Press
Year: 2009
Marlies Gabriele Prinzl, Centre for Intercultural Studies, University College London
With 'Corpus Linguistics and the Description of English', Hans Lindquist offers another introductory book to corpus linguistics, but aims it specifically at ''university students of English at intermediate to advanced levels who have a certain background in grammar and linguistics, but who have not had the opportunity to use computer corpora to any great extent'' (xvi). He proposes that the book, especially certain sections of it, may also be of interest to students of literature.
'Corpus Linguistics and the Description of English' comprises ten chapters. Chapters 1-5 cover the basics, introducing corpus linguistics as a discipline, discussing its methods and explaining key terms; chapters 6-10 delve into more specific and varied subject matter, ranging from corpus-based metaphor studies to the applications of corpora in sociolinguistics. Readers new to corpus linguistics would therefore benefit from reading the whole of the first part of the book, but might opt to peruse only those chapters of the second part that are relevant to their studies. That said, chapters 6-10 provide a valuable overview of the different possibilities within corpus linguistics for anyone new to the field. All chapters are set up in an identical fashion and include, in addition to a discussion of the topic covered, a chapter summary, study questions, suggestions for further reading and online corpus exercises on the book's supplementary webpage.
The first chapter introduces corpus linguistics as a field. It is established from the beginning that the name indicates not so much what is being studied as the methodology that is being used. Lindquist also notes, however, that ''it cannot be denied that corpus linguistics is also frequently associated with a certain outlook in language'' (p. 1). Furthermore, the author emphasises that the book's focus is on transmitting the ''joy and fascination that lie in the description of the English language'' (p. 1). Like other introductory books, e.g. Kennedy’s 'An Introduction to Corpus Linguistics' (1998) or Meyer’s 'English Corpus Linguistics: An Introduction' (2002), Lindquist commences with a historical overview of the field and lists the first corpora. All the essential basics are covered: concordances, frequency, the distinction between corpus-based and corpus-driven approaches, and different corpus types (spoken, general, specialised, historical, parallel). The use of dictionaries, text archives and the web as corpora is also mentioned.
In the second preparatory chapter, ''Counting, calculating and annotating,'' Lindquist provides further bases for corpus research, delineating quantitative and qualitative methods. He raises the indispensable question of what makes a word. A significant part of the chapter is devoted to managing and comparing frequency data by using statistical methods such as significance testing and measurement of strength of lexical association. Lindquist notes that the extent to which such methods are used in the field varies greatly, as language scholars have not traditionally received training in statistics, and that some researchers (Gries 2006) have called for ''greater sophistication'' (p. 37) in this area. Finally, the chapter also introduces students to corpus annotation.
Chapter 3, ''Looking for Lexis,'' discusses the uses of corpora for lexicographers. It explores the different meanings of words through the example of 'squeeze,' concluding that ''the meaning of a word can only be ascertained by looking at the contexts in which it occurs'' (p. 57). This observation subtly hints at that ''certain outlook in language'' associated with corpus linguistics, to which Lindquist already alluded in chapter 1. As part of this discussion, several more important terms are introduced ('collocation', 'colligation', 'semantic preference,' and 'semantic prosody'), all of which are carefully defined and illustrated by examples. 
Lindquist moreover does not omit to mention that some of the terms are controversial. The chapter also considers lexical changes over time, a topic that is picked up again in chapter 9. Finally, an account is given of how corpus techniques can be used not only to study language as a whole but also to examine how it is used by a specific writer or within a single work.
As the title ''Checking collocations and colligations'' already indicates, chapter 4 more thoroughly explores two terms introduced in the previous chapter, focusing predominantly on collocations. Lindquist commences with a discussion of native-like fluency, stating that ''[t]he ability to combine words in the right way is the key to native-like fluency'' (p. 71). This leads into collocations and the challenge they pose to Chomskyan (generative) linguistics. Lindquist's treatment of ''collocation'' goes far back in time, as he attributes the first usage to the educationalist H.E. Palmer in 1933, who defined the term as ''a succession of two or more words that must be learnt as an integral whole and not pieced together from its component parts'' (title page). The author then proceeds, more familiarly, to Firth's usage of the term, which emphasises how the meaning of individual words is influenced by other words frequently occurring with them, and offers a second definition: ''The more-frequent-than-average co-occurrence of two lexical items within five words of the texts'' (Krishnamurthy 2004: xiii). Lindquist thus observes that Palmer's and Firth's definitions point to different concepts, but that linguists often use ''collocation'' without making a distinction.
The final general chapter, ''Finding Phrases,'' continues with language patterns by focusing on phrases, that is, ''more or less fixed strings which are used over and over again'' (p. 91). 
After naming some of the many terms used to refer to the phenomenon of phrases, Lindquist briefly explains John Sinclair's open choice principle and idiom principle (1991: 100), emphasising the view of language that has emerged through corpora: that there is a significant amount of linguistic repetitiveness and that language users will frequently rely on conventionalised utterances even when other possibilities exist. Most of the chapter is devoted to examining examples of idioms and recurrent phrases, allowing readers to gain insight into what kind of corpus research can be done and how. Lindquist uses both more established methods (querying different types of corpora for complete as well as incomplete phrase units and n-grams) and an emerging one (using Google to search for country-specific variants of ''storm in a teacup''), with the latter serving as an introduction to a method explored more thoroughly in chapter 10.
As the first of the more specialised chapters, chapter 6, ''Metaphor and Metonymy,'' commences with a mention of Lakoff and Johnson's influential book 'Metaphors We Live By,' which has motivated increased interest in metaphor since its 1980 publication. Lindquist provides definitions of metaphor and the related concepts of simile and metonymy, but the chapter’s focus is really on the first term. Three different procedures for investigating metaphors are presented (starting with the source domain, starting with the target domain, starting from a manual analysis), giving students a good understanding of the options in corpus-based metaphor studies.
Chapter 7 illustrates the possibilities that corpora offer for studying grammar and presents different sample studies (mostly diachronic in nature, and some comparing American and British usage) on pronouns, get-passives and so forth. 
Although most of these studies were done by other researchers, Lindquist also replicates them and provides helpful step-by-step instructions as well as critical discussion of differences in results.
The next chapter, ''Male and Female,'' investigates the application of corpora in sociolinguistics, specifically in relation to gender-specific differences in language. The usefulness of corpus metadata such as a speaker's social class, educational level and age is highlighted. Lindquist immediately notes, however, that such data are lacking in most corpora and that possibilities for sociolinguistic research are therefore still quite limited. The chapter then looks at a number of different studies investigating gender in terms of both how men and women talk and how they are talked about, focusing, as in the previous chapter, on diachrony.
After many examples exploring language change throughout 'Corpus Linguistics and the Description of English', chapter 9 is exclusively dedicated to the topic. It commences with the essential explanation of synchronic and diachronic perspectives. The focus is, of course, on the latter, and a distinction is made between the two major ways to study change in language through corpora: the study of 'change in real time' and 'change in apparent time'. The difficulty of identifying causes of language change (whether they are internal or external) is considered, and plenty more sample studies are discussed. The most notable of these is perhaps the study presented in section 9.4, which, instead of relying on modern and historical corpora, uses the online Oxford English Dictionary (OED) as a source of data.
The final chapter ventures into territory not treated in most older introductory books on the subject: the World Wide Web. 
It is an area that has only recently started to gain traction, but one, as Lindquist notes, that fills certain gaps, since for some linguistic research ''standard corpora, even if they contain 100 million words or more, do not provide enough data'' (p. 187). A useful distinction is made between ''web as corpus'' (using search engines to trawl the web as a corpus) and ''web for corpus'' (using the web as a resource to create a corpus). The chapter covers ways of using the web in corpus linguistics and includes sample studies such as Mair's 2007 research on preposition use in different regional varieties of English. While the advantages of the web for corpus linguistics (text types not found elsewhere, quantity of results, et cetera) are pointed out, Lindquist also considers drawbacks and issues (replication, biased random sampling, lack of linguistic annotation), concluding that ''an important part of corpus linguistics in the future will be web-based in one way or another'' (p. 205).
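To make the window-based definition of collocation quoted in the summary of chapter 4 more concrete (co-occurrence of two lexical items within five words of text, more often than average), the following Python sketch counts the words that occur near a node word and compares each observed count with a rough expected count. It is not taken from Lindquist's book: the function names, the naive tokeniser and the simple expected-frequency estimate are all assumptions made purely for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters/apostrophes (a deliberately naive tokeniser)."""
    return re.findall(r"[a-z']+", text.lower())


def window_collocates(tokens, node, span=5):
    """Count every word found within `span` tokens to the left or right of each occurrence of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        lo = max(0, i - span)
        hi = min(len(tokens), i + span + 1)
        for j in range(lo, hi):
            if j != i:  # skip the node itself
                counts[tokens[j]] += 1
    return counts


def observed_vs_expected(tokens, node, span=5):
    """Map each collocate to (observed, expected) counts.

    Assumption for illustration: the expected count is the collocate's
    relative frequency in the whole text multiplied by the number of
    window positions around the node. An observed count well above the
    expected one suggests a "more-frequent-than-average" co-occurrence.
    """
    freq = Counter(tokens)
    total = len(tokens)
    observed = window_collocates(tokens, node, span)
    slots = sum(observed.values())
    return {w: (obs, freq[w] / total * slots) for w, obs in observed.items()}


if __name__ == "__main__":
    # Hypothetical toy text; a real study would query a corpus instead.
    sample = tokenize("A storm in a teacup is a storm in a small cup.")
    for word, (obs, exp) in observed_vs_expected(sample, "storm", span=2).items():
        print(word, obs, round(exp, 2))
```

A real study would of course draw on a full corpus, a linguistically informed tokeniser and one of the association measures of the kind discussed in chapter 2, rather than this bare observed-versus-expected comparison.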
'Corpus Linguistics and the Description of English' provides an introduction to the subject that is highly accessible for university students of English at different levels. The book meets the goals it sets for itself and is very much a hands-on guide with a multitude of sample studies and clear step-by-step instructions. Exercises in every chapter allow readers to check their understanding of the concepts introduced and give them the opportunity to actually query corpora themselves. In terms of content, the book, a slim volume of no more than 219 pages, manages to be surprisingly comprehensive, presenting a wide range of topics, including some options (OED as corpus, the web as corpus) that make it an up-to-date introduction to a still-evolving field. Commendably, linguistic as well as literary applications of specific methods are discussed.
On occasion a more thorough exploration of topics would have been useful. For example, the chapter entitled “Metaphor and Metonymy” currently only serves readers interested in the former rhetorical device, as not even the ‘Further Reading’ section includes any recommendations for students wanting to find out more about corpus-based studies of metonymy. However, there is little else to criticise, and the suggestions that follow are more of a wish list. A glossary would be a welcome addition to future editions, as students new to the field would surely find a checklist of all the specialist terms very helpful: many terms are introduced in 'Corpus Linguistics and the Description of English', which can quickly feel overwhelming. A key to the exercises should also always be included, even when the tasks set are as straightforward as in this volume. 
The companion website of the book (http://www.euppublishing.com/series/ETOTELAdvanced/Lindquist) is already a wonderful supplement to the book, but more thorough use could be made of it: all web-based resources mentioned by Lindquist should be listed there, including the online tutorials for statistics, which are currently only vaguely referenced. With print editions, listings of online resources can be problematic, as links quickly become outdated or inactive; the supplementary website for this book, however, means that it should be fairly easy to keep recommendations current.
All things considered, Lindquist’s 'Corpus Linguistics and the Description of English' is an excellent book for anyone wishing to become acquainted with corpus linguistics and its wide range of applications. As it can be either read from cover to cover or perused selectively, it is suitable for many types of readers. Without doubt, the book will be appreciated by individuals with little or even no background in the subject and, because it succeeds at transmitting that “joy and fascination” (p. 1) and provides plenty of ideas for new projects, it is bound to inspire students to further explore the field in exactly the way that suits them best.
Gries, Stefan Th. (2006) ''Some proposals towards a more rigorous corpus linguistics'', Zeitschrift für Anglistik und Amerikanistik, 54 (2), 191-202.
Kennedy, Graeme (1998) An Introduction to Corpus Linguistics. London, New York: Longman.
Krishnamurthy, Ramesh (2004) “Editor’s Preface,” in John Sinclair, Susan Jones and Robert Daley, English Collocation Studies: The OSTI Report. London: Continuum, xii-xv.
Lakoff, George and Mark Johnson (1980) Metaphors We Live By. Chicago, London: University of Chicago Press.
Mair, Christian (2007) ''Change and variation in present-day English: Integrating the analysis of closed corpora and web-based monitoring'', in Marianne Hundt, Nadja Nesselhauf and Carolin Biewer (eds.), Corpus Linguistics and the Web. Amsterdam: Rodopi, 233-247.
Meyer, Charles F. (2002) English Corpus Linguistics: An Introduction. Cambridge: Cambridge University Press.
Palmer, Harold E. (1966) Second Interim Report on English Collocations. Tokyo: Kaitakusha.
Sinclair, John (1991) Corpus Concordance Collocation. Oxford: Oxford University Press.
ABOUT THE REVIEWER:
Marlies Gabriele Prinzl is an MPhil/PhD candidate at the Centre for
Intercultural Studies at University College London, UK. Her research
interests include corpus linguistics, translation studies and comparative