Review of Corpora: Pragmatics and Discourse

Reviewer: Lamont D. Antieau
Book Title: Corpora: Pragmatics and Discourse
Book Author: Andreas H. Jucker, Daniel Schreier, Marianne Hundt
Publisher: Rodopi
Linguistic Field(s): Discourse Analysis; Text/Corpus Linguistics
Subject Language(s): English
Issue Number: 21.2695

EDITORS: Jucker, Andreas H.; Schreier, Daniel; Hundt, Marianne
TITLE: Corpora
SUBTITLE: Pragmatics and Discourse
SERIES TITLE: Language and Computers: Studies in Practical Linguistics, 68
YEAR: 2009

Lamont D. Antieau, Independent Scholar


Corpus-based approaches have been used to study many facets of language
structure, particularly in recent years; however, as editors Andreas H. Jucker,
Daniel Schreier and Marianne Hundt point out in the introduction to this volume,
''the potential of corpus linguistics has not yet been fully explored for either
discourse analysis or pragmatics'' (5). Only a relatively small number of
scholars working in pragmatics, for instance, have used large-scale corpora for
analysis, and the starting point for such studies has typically been ''either a
discourse particle with a fixed form that can easily be retrieved from a large
corpus, or a speech function that is generally realized in a small number of
variant patterns'' (4). Thus, the editors propose that ''the time is right to
encourage and promote more systematic cooperation between researchers
investigating pragmatics and discourse on the one hand and those working with
corpus-linguistic methods on the other'' (6). As a way of showcasing current
research being done along these lines, ''Corpora: Pragmatics and Discourse''
presents 22 papers from the 29th International Conference on English Language
Research on Computerized Corpora (ICAME 29), which was held in Ascona,
Switzerland, in May 2008.


The first of the volume's three sections is entitled ''Pragmatics and discourse''
and comprises 10 papers on the special topic of the conference, with half taking
a historical perspective and the other half focusing on present-day English.
It begins with two plenary papers: the first by Thomas Kohnen, which provides an
overview of research done in historical corpus pragmatics and offers suggestions
for future work, particularly in the area of speech acts, and the second by Irma
Taavitsainen, which examines the dissemination of knowledge and negotiation of
meaning across a wide range of medical texts written in the Early Modern English
period. In the next paper, Tanja Rütten uses the Corpus of English Religious
Prose to examine changes in the use and distribution of exhortations in
religious texts from the 14th to the 17th century. Next, Minna Nevala
investigates the Corpus of Early English Correspondence to determine how the use
of the word 'friend' changed over the span of the 17th and 18th centuries in
terms of its referential range and instrumental function. Minna Palander-Collin
then examines variation and change in self-reference in the letters of gentlemen
in 16th- and 18th-century correspondence, speculating that changes in
self-reference marking might be related to changes in stance and involvement in
English after 1650.

In the first of five papers in the section that examine present-day English,
Anita Fetzer analyzes the use of 'sort of' and 'kind of' in the political
speeches and interviews of high-ranking British officials from 1990 to 2006.
Next, Karin Aijmer investigates the multifunctional phrase 'I don't know' to
determine differences in its use by native speakers and by learners. In the
following paper, Magnus Levin and Hans Lindquist investigate how the recurrent
phrases 'on the face of it', 'on its face' and 'in (the) face' serve
text-organizing functions stemming from the process of grammaticalization. Karin
Axelsson uses the British National Corpus to highlight problems in using corpus
data for research on fictional narratives and dialogue in fiction and proposes
several solutions, including the creation of annotated corpora and the use of
sampling procedures. In the final paper of the section, Anna Marchi and Charlotte
Taylor use the framework of Corpus-Assisted Discourse Studies to examine how
British newspapers provide evidence of diachronic change in how the European
Union is perceived.

The second section of the volume is entitled ''Lexis, grammar and semantics'' and
comprises case studies that address specific lexical, syntactic and semantic
issues by integrating corpus-based research with the approaches of pragmatics
and discourse. In the first paper of the section, Stephen Coffey uses the British
National Corpus to investigate the lexico-grammatical frame exemplified by the
phrase 'a nightmare of a trip' from various perspectives. Next, Magali Paquot
and Yves Bestgen compare three statistical tests for extracting keywords from
corpora (the log-likelihood ratio, the t-test and the Wilcoxon-Mann-Whitney
test), applying each to find significant differences in the frequency and
distribution of words between two subcorpora of the British National Corpus:
one comprising academic texts and the other literary texts.
Naixing Wei
focuses on a wide variety of phraseological features found in Chinese learner
spoken English, with implications for both second language acquisition and
pedagogy. Jukka Tyrkkö and Turo Hiltunen then examine the distribution of
nominalizations in the Early Modern English Medical Texts corpus to determine
whether their use increased between 1500 and 1700. Arja Nurmi uses the Corpus of Early
English Correspondence to examine the social history of 'may' by tracking its
use by members of various social groups between 1400 and 1800. Sara Gesuato
combines corpus-based methods with acceptability judgments from native speakers
to investigate the semantics of 'go' followed by infinitival verbs. Next,
Carolin Biewer uses the International Corpus of
English to compare the distribution of get-passive and be-passive constructions
in Fijian English to that of other varieties of English in an effort to reveal
major influences behind Fijian English. The paper by Ingvilt Marcoe presents a
contrastive analysis of subordinating conjunctions used in religious treatises
and prayers of the Middle English and Early Modern English periods found in the
Corpus of English Religious Prose. In the last paper of the section, Daniël Van
Olmen uses the Great Britain component of the International Corpus of English
and a Northern Dutch corpus compiled from the Spoken Dutch Corpus to conduct a
contrastive analysis of imperatives in English and Dutch and also analyzes the
linguistic alternatives to imperatives that are available to speakers of both
languages.
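
For readers unfamiliar with keyword extraction, the log-likelihood comparison
mentioned in the summary of Paquot and Bestgen's paper can be illustrated with a
minimal sketch. It is not the procedure used by the authors. It assumes the
simplified two-corpus form of the log-likelihood statistic, and the function
names, the toy token lists and the 3.84 cut-off (the chi-square critical value
for p < 0.05 with one degree of freedom) are illustrative choices only. The
t-test and the Wilcoxon-Mann-Whitney test would also need frequencies for each
individual text, which this sketch does not model.

```python
import math
from collections import Counter


def log_likelihood(freq_a, total_a, freq_b, total_b):
    """Log-likelihood (G2) score for one word observed in two corpora."""
    combined = freq_a + freq_b
    # Expected frequencies if the word were equally likely in both corpora.
    expected_a = total_a * combined / (total_a + total_b)
    expected_b = total_b * combined / (total_a + total_b)
    score = 0.0
    for observed, expected in ((freq_a, expected_a), (freq_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            score += observed * math.log(observed / expected)
    return 2 * score


def keywords(tokens_a, tokens_b, threshold=3.84):
    """Words relatively more frequent in corpus A than in B, best first."""
    counts_a, counts_b = Counter(tokens_a), Counter(tokens_b)
    total_a, total_b = len(tokens_a), len(tokens_b)
    scored = []
    for word, freq_a in counts_a.items():
        freq_b = counts_b.get(word, 0)
        # Keep only words that are overused in A relative to B.
        if freq_a / total_a > freq_b / total_b:
            score = log_likelihood(freq_a, total_a, freq_b, total_b)
            if score >= threshold:
                scored.append((word, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data standing in for an academic and a literary subcorpus.
academic = ["hypothesis"] * 20 + ["the"] * 80
literary = ["the"] * 80 + ["heart"] * 20
print(keywords(academic, literary))
```

In this toy example, 'the' has the same relative frequency in both lists and is
therefore not reported, whereas 'hypothesis' occurs only in the first list and
is returned as a keyword of it.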

The final section is ''Corpus compilation, fieldwork and parsing'' and focuses
specifically on methodological problems associated with integrating the study of
pragmatics and discourse with corpus-based approaches. Dagmar Deuber discusses
fieldwork being conducted toward a Caribbean component of the International
Corpus of English and describes linguistic and sociolinguistic approaches for
analyzing grammatical variation in the corpus. Alpo Honkapohja, Samuli
Kaislaniemi, and Ville Marttila present the Digital Editions for Corpus
Linguistics project, the aim of which is to make historical manuscripts available
online to facilitate their use by linguistic and historical researchers. In the
last paper of the volume, Hans Martin Lehmann and Gerald Schneider use a
syntactic parser on the British National Corpus to compile a database of
syntax-lexis interactions, detailing the methodological problems presented by
such an approach.


This book is an excellent resource for anyone interested in how corpus-based
research can contribute to the study of higher-level linguistic phenomena. The
breadth of the collected papers is impressive, both in terms of their linguistic
objects of study, which vary from high-level categories, such as exhortations,
to specific constructions and words, and in terms of the range of methodological
approaches that are adopted by the authors of the studies. Although its main
focus is on English, the book goes beyond the investigation of British and
American English to touch on Fijian English, Caribbean English and English as a
second language. It also strikes a nice balance between diachronic and
synchronic studies, and its coverage of the Early Modern English period in
particular should be engaging to scholars of that time period while also being
of interest to those students with questions about how English arrived at its
current state or how languages change in general. With respect to the latter,
articles in the book discuss diachronic change both in the meaning and use of
individual lexical items and in the overall structures of canonical texts
in specific domains.

The volume is also broad in its interdisciplinary coverage, discussing
linguistic issues in such fields as medicine, religion and politics, and as
such, it should appeal to a wider range of readers than many books on
linguistics. Given that the language of law is such a rich area of pragmatics
and discourse, and is well represented in written and spoken texts, it was
surprising to find that the book includes no studies of legal language and that,
moreover, the language of law is rarely addressed in the volume at all. This
lack, however, is made up for by the variety of subjects the volume does
encompass and the quality of the papers it comprises.

''Corpora: Pragmatics and Discourse'' is a remarkable volume, presenting studies
that should appeal to newcomers to corpus-based research in pragmatics and
discourse as well as those who have already conducted this kind of research and
want to stay informed of current approaches.

Lamont Antieau is an independent scholar investigating sociolinguistic variation in the biomedical and legal domains in an effort to improve information retrieval and automatic question answering in these areas. His primary research interests are in dialectology, typology and computer-mediated interaction.

