Review of Corpus Studies in Contrastive Linguistics

Reviewer: Jesús Fernández-Domínguez
Book Title: Corpus Studies in Contrastive Linguistics
Book Author: Stefania Marzo, Kris Heylen, Gert De Sutter
Publisher: John Benjamins
Linguistic Field(s): Text/Corpus Linguistics
Subject Language(s): Dutch
Issue Number: 24.3381

The publication under review brings together six papers from the field of corpus-based contrastive linguistics, most of which were originally presented at the 5th International Contrastive Linguistics Conference (ICLC-5) held in Leuven in 2008. The volume is made up of a brief presentation by the editors followed by the six contributions, each of which includes an abstract, endnotes, bibliographic references and appendices, where necessary. The book contains a subject index.

The volume opens with the editors’ introduction, “Developments in corpus-based contrastive linguistics”, where they offer a general overview of the history of contrastive linguistics from a corpus-based perspective, outline the scope of the book and sketch the contents of the contributions. This foreword also justifies the presence of articles with a focus on semantic and pragmatic phenomena, and presents the goals of the volume as descriptive and theoretical in nature.

The first contribution is by Dirk Noël and Timothy Colleman: “Believe-type raising-to-object and raising-to-subject verbs in English and Dutch”. This article adopts a diachronic perspective to question the status of the so-called “raising-to-subject” pattern as a passive equivalent of “raising-to-object”, which has remained productive in English but not so in Dutch. The investigation makes use of two comparable corpora, the Corpus of Late Modern English Texts (CLMET) and one compiled along the same principles for Dutch (see Noël & Colleman 2009), with data in both languages from the 17th to the 20th century. A “distinctive collexeme analysis” (Gries & Stefanowitsch 2004) of the two constructions is undertaken in order to examine collocational preferences and differences in constructional semantics. The findings from this analysis confirm that there exists a group of English verbs with a strong preference for the “nominativus cum infinitivo” pattern, and that this pattern, though structurally passive, probably does not have a passive symbolic value. Various statistical tests show that the usage of the “accusativus cum infinitivo” and “nominativus cum infinitivo” variants does not simply depend on their active/passive voice, but on other symbolic values too.

In “Contingency hedges in Dutch, French and English” Bart Defrancq and Gert De Sutter look at elements which are used to linguistically moderate somebody’s viewpoint. The data used in the paper is derived from the British National Corpus, the Valibel corpus of Belgian French and the Corpus Gesproken Nederlands (the former two with around 10 million words each, the latter with 3.5 million words). The authors start by noting that English and French have one verb to perform hedging (‘depend’ and ‘dépendre’, respectively), while three options exist in Dutch (‘afhangen’, ‘te zien zijn’ and ‘liggen’). The two central questions that this work tries to answer are, first, in which way the three Dutch units correspond to the English and French one and, second, what the relationship is between ‘afhangen’, ‘te zien zijn’ and ‘liggen’ in Dutch. After a thorough series of frequency-based analyses, the authors determine that the hedging elements under study are dependent on previously occurring discursive information, and that they always carry a conditional or causal pragmatic load. The reader will find particular value in the statistical tests that buttress and illustrate the different points that are considered here (for example, statistical significance and chi-squared distribution). A second conclusion is that these units of contingency hedging have undergone an evident process of decategorialisation: due to their frequent use as markers of intersubjectivity, they have become considerably fixed, syntactically speaking, even if the Dutch elements seem to show a more flexible behaviour than their English and French counterparts.

The next paper is entitled “Cultural differences in academic discourse: Evidence from first-person verb use in the methods sections of medical research articles”. In it, Ian A. Williams makes use of comparable corpora, in this case one composed of English texts, the other one of Spanish texts, which together amount to 500,000 words, with the aim of inspecting the stylistics of first-person verbs occurring in the methodology sections of scientific articles. The work consists of a quantitative and a qualitative side. Regarding the former, the investigation shows that almost 50% of English and Spanish articles use first person self-references. As for the latter, the author pays attention to Sinclair’s (1996) four types of co-occurrence: collocation, colligation, semantic preference and semantic prosody. These mechanisms allow Williams to show that the use of such self-references is different in the two languages: while their purpose in English texts is to justify the author’s choices when designing a study or experiment, the first person in Spanish is used to bring the reader closer to the author of the work. One interesting finding of this article is the fact that translation plays an important role in how language is used in medical research articles, in such a way that articles translated from English into Spanish usually follow the stylistic trends of the L1 in aspects like the use of first person self-reference.

The next chapter is “A contrastive analysis of English and French argumentative discourse”, by Anita Fetzer and Marjut Johansson. In this case, the authors concentrate on English and French political television debates with the purpose of analysing the scope of action of the first person self-references of the verbs ‘think’ and ‘believe’, and ‘penser’ and ‘croire’. This discourse-based analysis is performed by consulting two comparable corpora comprising 29 British political interviews and 26 French political interviews, the former with c. 180,000 words, the latter with c. 119,000 words. As in Williams’ paper, we find a quantitative and a qualitative approach to the issue. The quantitative data shows that ‘I think’ is the favoured parenthetical in English, and ‘je crois’ is the preferred one for French, which does not mean, however, that they occur with an identical frequency as discourse connectives: it is clear, for instance, that the constructions under study occur more frequently in English than in French. The qualitative-based method shows that ‘believe’ and ‘croire’ often carry a “boosting function”, while ‘think’ and ‘penser’ may carry either a boosting or an attenuating function. In addition, it is revealed that the two languages display a marked preference for the connective ‘and’/‘et’. The examinations and analyses in this chapter are interspersed with numerous examples from the two corpora, which is appreciated given the fine-grained semantic-pragmatic differences between some of the cases under discussion.

Also revolving around English and French is the article by Issa Kanté: “Mood and modality in finite noun complement clauses”. The object of this contribution is English and French finite noun complement clauses and their relationship with modality. Kanté demonstrates that modality is an intrinsic feature of the nouns that occur within ‘that’-clauses, thus acting as modal stance markers. Once a solid theoretical background is provided, the paper’s hypotheses are put forward and the data (from the BYU-BNC and the Frantext corpora) is analysed. Particularly interesting is the study of the three modality groups into which nouns are categorised: epistemic (e.g. assertion, certainty, fact), alethic (e.g. likelihood, necessity, possibility) and deontic (e.g. constraint, demand, requirement). In this section, the author uses empirical data to underpin his claim that, in both English and French, epistemic nouns tend to favour the indicative. The situation is different with alethic and deontic nouns, since both types favour the subjunctive in French, while in English alethic nouns tend to opt for the indicative and deontic nouns choose the subjunctive. The findings, furthermore, seem to support the idea that the presence of some kind of modality is always found among the nouns governing ‘that’-clauses.

The volume closes with Aurelia Usoniene and Audrone Šoliene’s “Choice of strategies in realisations of epistemic possibility in English and Lithuanian”. This contribution pays attention to the process of formalisation of epistemic possibility in these two languages, under the assumption that English auxiliaries and adverbs will behave differently in this field of modality than Lithuanian modals and adverbs. This includes units like ‘can, could, may, might’ vs. ‘maybe, perhaps, possibly’ in English, and ‘galėti’ ‘can/could/may/might’ vs. ‘gal, galgi, galbūt, rasi, lyg ir’ ‘maybe/perhaps/possibly’ in Lithuanian. The study makes use of two comparable and parallel corpora derived from ParaCorpE-LT-E, a corpus made up of original English and Lithuanian fictional texts and their translations into Lithuanian and English, respectively. Among other results, this investigation points at a preference for modal auxiliaries in the case of English, and a preference for modal adverbials in the case of Lithuanian. Additionally, the analysis of translational correspondences validates the hypotheses on features that are different in original English and original Lithuanian. An asset of this paper lies in the fact that it depicts and characterises the process of grammaticalisation in the two languages studied, with English auxiliaries displaying a higher degree of grammaticalisation than Lithuanian ones. This also sheds light on the issue of translation equivalences in relation to grammatical categories.


This book is a welcome addition to the field of contrastive studies viewed from the empirical side of corpus linguistics. The six contributions have the common goal of providing a descriptive and theoretical insight into the differences and similarities between languages, which they do by resorting to a limited array of languages, namely Dutch, Spanish, Lithuanian and, especially, French and English. In all cases, English serves as the tertium comparationis. As has been observed, the volume covers several linguistic domains (e.g. syntax, modality and discourse) and explores diverse types of research questions (e.g. grammaticalisation, pragmatic functions, stylistic functions and typological profile), which make it a varied and attractive work. As is expected for a volume centred on corpus studies, the reader is here provided with a substantial number of examples to empirically back up the points under discussion, to the point that almost every hypothesis and statement is checked against corpus records. This overwhelming amount of data is accompanied by statistical tests that support the validity and authenticity of the experiments. Furthermore, the investigations are carried out by using different types of corpora, and so it is possible to find corpora of a contemporary and historical nature, written and spoken, and embracing a range of text types. This is a positive aspect of the book, restrained only by the reduced number of contributions in it.

If, as can be read in the editors’ introduction, one of their aims is “to enhance the testability, authenticity and empirical adequacy in this field” (p. 2), this seems to be the right moment for this book. The shift that contrastive studies are currently witnessing, where growing attention is being paid to pragmatic and discourse processes, justifies a timely publication and the inclusion of chapters devoted mainly to these two subfields of linguistics, even if one wonders if articles belonging to spheres like phonology or morphology could have joined the collection.

All in all, this is an attractive and readable volume that will hopefully encourage future attempts in the subject of contrastive linguistics, for the languages examined here and beyond. It will be of interest to scholars already working in cross-linguistic areas of language, especially those with a focus on pragmatics and discourse studies, but perhaps less so to the uninitiated in the field of contrastive linguistics.


Jesús Fernández-Domínguez holds a PhD in English Linguistics and is currently a lecturer at the University of Valencia, Spain. His research has mainly focused on English word-formation and morphology, contrastive linguistics and learner corpora, among other things.