Date: 02-Jan-2017
From: Sibo Chen <>
Subject: Corpus linguistics on the move
EDITOR: María José López-Couso
EDITOR: Belén Méndez-Naya
EDITOR: Paloma Núñez-Pertejo
EDITOR: Ignacio M. Palacios-Martínez
TITLE: Corpus linguistics on the move
SUBTITLE: Exploring and understanding English through corpora
SERIES TITLE: Language and Computers: Studies in Digital Linguistics
YEAR: 2016

REVIEWER: Sibo Chen, Simon Fraser University

As a relatively young sub-discipline of linguistics, corpus linguistics has experienced exponential growth since the 1960s, with more and more studies being conducted by scholars across the globe. Founded in Oslo on 12th February 1977, the International Computer Archive of Modern and Medieval English (ICAME) has been a leading organization in this field, mainly focusing on compiling and distributing English language corpora for computer processing and linguistic research (ICAME, n.d.). Each year, ICAME hosts a symposium for its members. The current volume, “Corpus Linguistics on the Move”, provides a representative sample of the papers presented at the 34th ICAME conference, held in Santiago de Compostela in May 2013. This volume offers a succinct reflection of the ICAME community’s major areas of interest, as well as current trends in corpus linguistics in general.

The volume is divided into four parts. Part I discusses the various challenges associated with corpus compilation. The three chapters in this section introduce, respectively, a corpus focusing on the standardization of English during the medieval and post-medieval period (Chapter 2, “English Urban Vernaculars, 1400–1700”), an English for Academic Purposes corpus based on student writings at the Hanken School of Economics (Chapter 3, “Creating a Corpus of Student Writing in Economics”), and two multi-genre corpora representing advanced English-as-a-Second-Language (ESL) learners from Sweden and Finland (Chapter 4, “Ongoing Changes and Advanced L2 Use of English”). The discussions throughout these chapters highlight two notable challenges in corpus compilation. First, the development of historical sociolinguistics calls for the digitization of historical archives. While many historical archives have been made available online, it remains difficult and time-consuming to transcribe them into machine-readable texts. Second, the globalization of English has led to remarkable growth in the number of advanced ESL learners and to expanding varieties within World English. Yet neither trend has been well represented in existing corpora, and accordingly, ongoing efforts are needed to compile specialized corpora for both research and applied purposes.

The four chapters in Part II are concerned with register variations found in academic and professional texts. Chapter 5 (“Verbs and Verb Phrases in Advanced Dutch EFL Writing”) considers the syntactic development of advanced English as a Foreign Language (EFL) writing. By examining the acquisition of verb phrase complexity by four advanced EFL students in the Netherlands, the chapter shows that it is difficult to associate any specific feature of verb phrases with mature EFL writing, due to the enormous variation in verb phrase usage found among the EFL students’ writings. The chapter subsequently concludes that for advanced EFL learners, there remains a notable gap between passive knowledge and active control. Chapter 6 (“Discourse-Organizing Metadiscourse in Novice Academic English”) explores how native and non-native English writers use meta-discursive expressions to organize their academic texts. By comparing three different groups of writers (Norwegian learners of English, novice native English writers, and expert English writers) in linguistics and business, the chapter shows some interesting variations across writer groups and disciplines. First, writers in linguistics tend to use more meta-discursive expressions than writers in business. Second, compared with native English writers, non-native English writers tend to use more explicit meta-discursive expressions, probably with the purpose of compensating for their possible weakness in English proficiency. Third, as the level of expertise of the writer and of the reader increases, the need for meta-discursive expressions tends to be reduced. The focus on academic writing across disciplines is continued in Chapter 7 (“Passives in Academic Writing”), in which the use of the passive voice by novice and expert writers in both hard (medicine and physics) and soft sciences (law and literary criticism) is investigated. Overall, the chapter shows that passives are more frequent in the hard than in the soft disciplines.
While this tendency is replicated in student essays by novice writers, these essays have a lower rate of bare passives than published articles. The chapter argues that this difference is mainly caused by novice writers’ inadequate register awareness. Chapter 8 (“Adverbial Hapax Legomena in News Text”) shifts the analytical focus to adverbial formations in journalistic texts. Based on articles published in “The Independent” and “The Guardian” between 1984 and 2012, the chapter seeks to reveal factors conditioning the appearance of adverbial hapaxes in the journalistic register. Overall, the chapter argues that adverbial hapaxes are less likely to be formed if their base elements are rare, grammatically irregular, and context sensitive. Adverbial hapaxes formed from creative wordplay and non-standard dialects can be blocked by formal coinages.

In line with Part II, Part III explores grammatical features of English varieties. In Chapter 9, “English in South Africa: The Case of Past Referring Forms”, the preterite-present perfect alternation in South African English is examined. The chapter discusses the preterite in the context of South Africa’s unique linguistic ecology, showing that Black South African English privileges the preterite to align with the American English norm, whereas White South African English tends to use the traditional present perfect to align with the British English norm. Chapter 10 (“A Look at Participial Constructions with Get in Hong Kong English”) compares “get + past participle” constructions in Hong Kong English with British English and Indian English. The chapter proposes a taxonomy of the “get + past participle” constructions based on their levels of passiveness (central passives, pseudo, adjectival, idiomatic, and resultative constructions). According to the chapter’s analysis, the “get + past participle” constructions are less common in Hong Kong English than in the other two English varieties. Among the studied English varieties, the “get + past participle” constructions are predominantly agentless and they are often triggered when their syntactic subjects convey given information (e.g. “these commuters are getting ripped off by the public transit system”). Another commonality shared by the studied English varieties is that in terms of semantic prosody, the “get + past participle” constructions tend to convey neutral meanings. In turn, Chapters 11 and 12 (“Who is the/a/Ø Professor at Your University” and “Clause Fragments in English Dialogue”) consider different ways of making reference in British and American English. In Chapter 11, the use of articles with role predicates in historical American English is examined, based on twentieth-century data from the Corpus of Historical American English.
The chapter’s analysis focuses on five single role nouns (professor, president, governor, manager, and director) and the analysis shows that bare noun phrases are generally on the decrease with the selected nouns. By comparison, Chapter 12 investigates the different types of clause fragments (non-sentential units of discourse typically conveying propositional meaning) in spoken dialogues of British English. The chapter’s analysis of the British component of the International Corpus of English suggests that clause fragments are predominantly used for matching and extending the preceding conversations, thereby functioning as an important cohesive device in spoken British English.

Part IV contains three chapters that provide new insights into the pragmatics of spoken English. Chapter 13 (“The Expression of Directive Meaning”) discusses the variation between insubordinate if-clauses and other canonical constructions of directive expressions. The chapter’s investigation into the Diachronic Corpus of Present-Day Spoken English reveals that overall, let-imperatives are more frequently used to make directive expressions, compared with ordinary imperatives and insubordinate if-clauses. Although insubordinate if-clauses are not the most frequent type of directive expressions, their uniqueness lies in the fact that they are more indirect and less intrusive than ordinary imperatives. Chapter 14 (“Taboo Language and Swearing in Eighteenth and Nineteenth Century English”) examines bad language in Late Modern English, based on data from the Old Bailey Corpus 1720–1913. The chapter traces diachronic changes in swearing in three domains: the frequency of swearing, its types and functions, and its representation in print. The analysis shows an increasing disapproval of bad language over the examined period, which, the chapter argues, is due to crucial socio-cultural changes in late modern England. While a higher proportion of swearing expressions in the data are used by speakers from the lower social classes, it is somewhat surprising to find that much higher rates of these expressions are used by females than by males. Finally, Chapter 15 (“The ‘Humor’ Element in Engineering Lectures across Cultures”) considers one of the most challenging aspects of pragmatics: humor. The chapter seeks to account for how humor functions as a pragmatic device in three different cultural settings: the UK, Malaysia, and New Zealand.
The chapter’s analysis finds that the frequencies and functions of humor vary from culture to culture, which highlights the importance of recognizing inter-cultural differences in the context of the internationalization of higher education.


The volume’s major strength lies in the diversified topics presented in the chapters, which offer an impressive glance at the current research trends in corpus linguistics. Many chapters (especially those in Part I) include detailed descriptions of how their target corpora are compiled, parsed, and annotated, and these valuable pieces of information make the volume an ideal reference for researchers considering incorporating corpus-driven approaches into their own research. Another strength of the volume is its recognition of two important trends in World English: the exponential growth of advanced ESL learners and the proliferation of English varieties. The insightful discussions on both topics throughout the volume can be particularly illuminating for scholars working on language change and variation.

Admittedly, the volume contains two notable shortcomings, which constrain the scope of its readership. First and foremost, there has been a steady growth in corpus-driven research in the field of discourse analysis (cf. Baker, 2006) and it is disappointing to see that this trend has been largely neglected in the current volume. In addition, for scholars seeking extended discussions and interpretations, the volume’s predominantly descriptive writing style can be somewhat dull.

That being said, the volume remains an enjoyable and valuable read if you are looking for a volume showing corpus-based ways of investigating linguistic features in historical and contemporary English.


Baker, P. (2006). Using corpora in discourse analysis. New York, NY: Continuum.
ICAME (n.d.). Retrieved from


Sibo Chen is SSHRC Vanier Doctoral Fellow in the School of Communication, Simon Fraser University. His major research interests are language and communication, critical discourse analysis, and genre theories.

