Review of Corpus Linguistics for Grammar

Reviewer: Sabina Tabacaru
Book Title: Corpus Linguistics for Grammar
Book Author: Christian Jones, Daniel Waller
Publisher: Routledge (Taylor and Francis)
Linguistic Field(s): Applied Linguistics; Linguistic Theories; Text/Corpus Linguistics
Subject Language(s): English
Issue Number: 26.5515

Reviews Editor: Helen Aristar-Dry


The volume “Corpus Linguistics for Grammar” provides a practical introduction to the use of corpus linguistics to analyze grammatical and lexico-grammatical patterns, offering evidence of language in use. It is conceived as a “how-to guide for those who are interested in using corpora to research grammar” (p. 2). It is divided into three parts and nine chapters, with each section followed by try-it-yourself exercises and/or sample exercises. Suggested answers are included at the end of the book.


Part I, entitled ‘Defining grammar and using corpora’, is divided into three chapters. Chapter 1 (‘What is a corpus? What can a corpus tell us?’) defines the concepts of corpus, types, and tokens and describes how they can be dealt with in corpus linguistics. The authors adopt a descriptive approach, one that focuses on language in use and the rules that follow from it, as opposed to a prescriptive approach, one that is based on intuitive rules. They point out that words cluster together “in predictable rather than random patterns” (p. 14) and that analyses of corpora will show that syntax and lexicon are not independent components of English (see also Biber et al. 1999). Although a corpus can tell a lot about the use of language, it cannot show why certain patterns occur. The authors explain that it is the role of the researcher to interpret the data and their uses. Chapter 2 (‘Definitions of a descriptive grammar’) defines descriptive grammar, which first looks at how often aspects of language occur in a corpus and then comments on these frequencies, providing rules for the patterns used (following Biber et al. 1999; Carter and McCarthy 2006; Halliday and Matthiessen 2004, 2013; Hymes 1972; and Sinclair 1991). Seeing grammar at the word, sentence, and text level is essential for corpus linguistics because it explains how language is used in context. The authors also discuss the distinction between spoken and written language, underlining that “speech has a grammar that is often distinct from writing” (p. 27). An example is the verb “marry”, which is used differently in written and spoken corpora. In a newspaper corpus, the past form “married” is most frequently used and it collocates with the verb “to be” or “to get”, whereas in the spoken corpus (chat shows or news shows), it will be used mostly in the future tense (i.e., “going to get married”) and it will collocate primarily with “to get.” Chapter 3 (‘What corpora can we access and what tools can we use to analyze them?’) supplies information about the corpora that can be accessed for such analyses and the tools to analyze them. A practical table of the available corpora is provided, giving the name, number of tokens, description, and advantages of each corpus. Examples are provided on how to search and analyze these corpora. Furthermore, the authors suggest tools that help readers compile their own corpus.
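
As a rough illustration of the notions of tokens, types, and collocates mentioned above, the following Python sketch counts them in a handful of sentences. It is not taken from the book: the crude tokenizer, the function names, the two-word window, and the sample sentences are all assumptions made for the sake of the example.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())

def type_token_counts(tokens):
    """Return (number of tokens, number of types), where types are distinct word forms."""
    return len(tokens), len(set(tokens))

def left_collocates(tokens, node, window=2):
    """Count the words found up to `window` positions before each occurrence of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            counts.update(tokens[max(0, i - window):i])
    return counts

# Invented sample sentences (hypothetical, not data from the book).
sample = (
    "They were married in June. She got married last year. "
    "We are going to get married. He is going to get married soon."
)

tokens = tokenize(sample)
n_tokens, n_types = type_token_counts(tokens)
collocates = left_collocates(tokens, "married")

print(n_tokens, n_types)
print(collocates.most_common())
```

On a real corpus the same counts would normally come from a concordancer or a corpus query interface; the sketch only shows what is being counted.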

Part II, entitled ‘Corpus linguistics for grammar’, is divided into three chapters dealing with different areas of investigation: frequency, chunks and colligation, and semantic prosody. Chapter 4 (‘Frequency’) examines “how often the target of a search occurs, whether this is words or structures” (p. 63). Several examples follow, providing evidence of the frequency of modals or certain expressions in different kinds of corpora. The context in which these expressions are found reveals more about their grammatical or lexico-grammatical patterns and functions, and thus proves useful for investigating the use of language. The authors show that different texts can produce different results for the same input. In Chapter 5 (‘Chunks and colligation’), the authors explain how words cluster together in particular sequences that have their own grammatical properties. These phenomena can give essential information about how texts are patterned. Quantitative measures can be used to analyze how these patterns make meaning, and it is the role of the researcher to explain why these patterns occur in a certain way and with a certain frequency. Chapter 6 (‘Semantic prosody’) lays out the connotations (positive, negative, or neutral) carried by language use. Lexis and grammar are equally important in expressing an idea (for instance, the passive voice conveys a more objective tone than the active voice). Context, then, plays an essential role in grammatical choice, as is shown by data taken from spoken and written texts (academic writing). The authors conclude that “grammar is far more than simply a skeleton upon which we hang meaning via lexis” (p. 114).
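
Two of the measures discussed in Part II, frequency and chunks, can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: it assumes that frequencies are compared across corpora of different sizes by normalizing to occurrences per million words, and that chunks are approximated by contiguous n-grams. The function names, the counts, and the sample sentence are invented.

```python
from collections import Counter

def per_million(raw_count, corpus_size):
    """Normalize a raw count to occurrences per million words."""
    return raw_count * 1_000_000 / corpus_size

def ngrams(tokens, n):
    """Return every contiguous sequence of n tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Hypothetical figures: 250 hits in a corpus of 5 million words.
rate = per_million(250, 5_000_000)

# Invented sample text, already tokenized by whitespace.
tokens = "i do not know what you mean but i do not know why".split()
trigram_counts = Counter(ngrams(tokens, 3))

print(rate)
print(trigram_counts.most_common(2))
```

Normalized rates make a count from a small spoken corpus comparable with one from a much larger written corpus, which is the kind of comparison the chapter on frequency relies on.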

Part III, entitled ‘Applications to research’, is also divided into three chapters, investigating the different research areas in which corpora can be used. Chapter 7 (‘Applications to English language teaching’) investigates the grammatical aspects that a corpus reveals in areas such as EFL, ESL, and first language learning. Comparing certain patterns in syllabi shows how frequently these sequences are used and in which type of discourse they are more often encountered, so teachers can focus their attention on these aspects when teaching English. Regarding first language learning, the use of corpora can be extended to develop awareness of grammar, vocabulary, and phonology (p. 133) in primary and secondary schools. The authors also acknowledge the limitations of such requirements (time-consuming for teachers, for example) but argue in favor of corpus-informed syllabi and methodologies, adding that “it is far more desirable to have the information a corpus can provide us than not have it” (p. 136). Chapter 8 (‘Wider applications. Data driven journalism and discourse analysis’) explores frequencies at word, sentence, and text level in different kinds of speeches. The authors describe the use of certain pronouns in political speeches and the writing techniques in business letters (taken from Vergaro 2005) that could account for ‘Intercultural Discourse’. They then investigate the use of “hereby” by exploring the GloWbE corpus (a 1.9-billion-word collection of web pages from around the world) in order to examine intercultural differences. Lastly, Chapter 9 (‘Research projects’) explores potential research projects using the techniques outlined in the book: frequency, collocations and colligation, and colligation and semantic prosody. The authors provide research questions and hypotheses from different perspectives in order to show how research can be undertaken using corpus data.

This volume also contains a list of figures, a list of tables, acknowledgements, and a list of abbreviations. The sample exercises at the end of the book are followed by a glossary and an index.


This volume provides useful insight into the use of grammar today. The sample exercises included in each chapter and section are practical for the purpose at hand. The bibliography for each section is placed at the end of each chapter instead of at the end of the book, which makes this volume useful in any (grammar) class, as students are prompted to learn more about the topics, without having to search through a full bibliography at the end of the volume.

The book is practical and accessible for students and teachers of grammar in particular and linguistics in general. Chapter 3, dealing with the use and analysis of corpora, provides short explanations and comparisons of statistical methods to investigate the data. Exercises are provided that students can try themselves (or that can be used in class by teachers). The answers at the end of the volume allow students and researchers to compare their findings to the ones given by the authors. Several potential research projects are also described in Chapter 9, where the authors underline how easy it is to use corpus analysis tools to investigate language.

The arguments provided by the two authors are also seen from different perspectives, and limitations to such methodologies are often given at the end of sections. This contributes to an analytical view of corpus-informed teaching that is aware of the advantages and disadvantages of such techniques. However, the authors’ description of such tools convincingly argues for the improvements brought by corpora in language teaching.

Although the volume is accessible, the exercises at the end of each section can be seen as a drawback as they interrupt the reading. Placing them at the end of each chapter instead of inside each section would have provided a volume more suitable for students and teachers inside a classroom, dealing with one topic at a time. Nevertheless, we can see why the authors chose to place them inside the sections: many of these exercises are explained afterwards as a step forward into the arguments provided.

All in all, this volume is highly recommended to students and teachers interested in grammar and linguistics. It focuses on the advantages offered by corpus linguistics and the analytical view it provides on language teaching. Students can also learn how to make and explore their own corpora, which is a very useful tool for investigating language nowadays.


Biber, D., Johansson, S., Leech, G., Conrad, S. and Finegan, E. 1999. “Longman grammar of spoken and written English”. London: Longman.

Carter, R. and McCarthy, M. 2006. “Cambridge grammar of English”. Cambridge: Cambridge University Press.

Halliday, M.A.K. and Matthiessen, C. 2004. “An introduction to functional grammar” (3rd edition). London: Routledge.

Halliday, M.A.K. and Matthiessen, C. 2013. “Halliday’s introduction to functional grammar” (4th edition). London: Routledge.

Hymes, D.H. 1972. On communicative competence, in J.B. Pride and J. Holmes (Eds.) “Sociolinguistics, selected readings”. Harmondsworth: Penguin, 269-293.

Sinclair, J. 1991. “Corpus, concordance, collocation: describing English language”. Oxford: Oxford University Press.

Vergaro, C. 2005. ‘Dear Sirs, I hope you will find this information useful’: discourse strategies in Italian and English ‘For Your Information’ (FYI) Letters. Discourse Studies 7(1), 109-135.

Sabina Tabacaru recently received her PhD from the University of Lille (France) and K.U. Leuven (Belgium). Her research interests include cognitive linguistics, gesture analysis, and grammar, applied to the study and the understanding of humor in interaction.

Format: Hardback
ISBN-13: 9780415746403
Pages: 202
Prices: U.S. $ 160.00
Format: Paperback
ISBN-13: 9780415746410
Pages: 202
Prices: U.S. $ 49.95