Review of Introducing Corpora in Translation Studies

Reviewer: Annelie Ädel
Book Title: Introducing Corpora in Translation Studies
Book Author: Maeve Olohan
Publisher: Routledge (Taylor and Francis)
Linguistic Field(s): Text/Corpus Linguistics
Issue Number: 16.2450

Date: Wed, 17 Aug 2005 21:24:49 -0400
From: Annelie Ädel
Subject: Introducing Corpora in Translation Studies

AUTHOR: Olohan, Maeve
TITLE: Introducing Corpora in Translation Studies
PUBLISHER: Routledge (Taylor & Francis)
YEAR: 2004

Annelie Ädel, English Language Institute, University of Michigan, Ann Arbor


Maeve Olohan's "Introducing Corpora in Translation Studies" is an
introductory work that explains how the analysis of corpus data can
make a contribution to the study of translation. Its primary audience
is "those who are familiar with translation studies but not corpora" (p.
1), but it is clearly also of interest to those who already use corpora. It
deals with the role of corpora in three areas of translation studies: (i)
translation studies research, (ii) translator training and (iii) translation
practice. For an indication of how firmly the focus is placed on the first
of these, we might note that only 8 pages are dedicated to (ii) and 14
to (iii) out of a total of 192 pages. The book documents the early years
of corpora in translation studies (originating with Mona Baker's 1993
book) and "gives an insight into some of the difficulties and
achievements" (p. 1) so far.

Chapter 1 introduces translation studies research. It begins with a
discussion of "the lack of consideration or relative invisibility of the
translator" (p. 4) and ends with an overview of recent theoretical
approaches taken in translation studies. The author makes it clear
that she finds it more fruitful to consider the corpus-based approach a
methodology than to consider it a "paradigm" of its own. Chapter 2
gives a brief introduction to corpus linguistics and descriptive
translation studies. It then summarises recent translation research that
has made use of corpus-linguistic tools.

Chapters 3 and 4 deal with parallel and comparable corpora,
respectively. Chapter 3 is critical of the way in which parallel corpora
have traditionally been used in contrastive linguistics. In Olohan's
analysis, instead of showing interest in the translation process per se,
researchers have considered the translations in the corpus "first and
foremost a reflection of the possibilities offered by the target language
system" (p. 24). In chapter 4, the use of comparable corpora in
translation studies is reviewed, with special attention given to research
that aims to find universal features of translated language (according
to Baker's original suggestion in 1995). The kind of comparable
corpus work discussed here "argues in favour of studying translations
without looking directly at source texts or at the relationship between
source and target text" (p. 43).

Chapter 5 is about corpus design, especially as it applies to
translation studies. It also gives a range of practical advice on corpus
compilation, even paying attention to the often neglected issue of
copyright and difficulties in obtaining permissions. The chapter does
not go into much technical detail. Alignment, for example, is treated
quite briefly. Olohan ends the chapter by turning our attention to six
different corpora used in translation studies research and summing up
their design criteria.

Chapter 6 generally describes corpus tools and data analysis. Like
most introductory textbooks on corpus linguistics, it describes
phenomena like concordances and POS-tags. It also provides some
simple quantitative measures, primarily frequency lists, type/token
ratio and keywords. It is understandable that the author wishes to give
the intended audience a basic introduction to primarily monolingual
corpus tools, but the next edition will hopefully include more than a
page on tools that directly apply to translation studies.
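
For readers unfamiliar with these measures, the following minimal sketch
shows what a frequency list, a type/token ratio and a simple keyword-in-
context concordance amount to. It is an illustration only and is not taken
from Olohan's book: the function names, the naive tokenizer and the sample
sentence are all assumptions made for this example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately naive tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def frequency_list(tokens):
    """Return (word, count) pairs sorted from most to least frequent."""
    return Counter(tokens).most_common()


def type_token_ratio(tokens):
    """Distinct word forms (types) divided by running words (tokens)."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def concordance(tokens, node, width=3):
    """Keyword-in-context lines: up to `width` tokens on each side of every hit for `node`."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{tok}] {right}".strip())
    return lines


# Hypothetical sample text, invented for this illustration.
sample = "The translator said that the text was clear and that the reader agreed."
tokens = tokenize(sample)

print(frequency_list(tokens)[:3])
print(round(type_token_ratio(tokens), 2))
for line in concordance(tokens, "that"):
    print(line)
```

Real corpus tools add lemmatization, POS-tagging and normalization for
text length (raw type/token ratios are not comparable across texts of
different sizes), none of which this sketch attempts.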

Chapters 7 and 8 summarise previous and current corpus-based
research in translation studies. Chapter 7 revolves around the idea
that translated language is "characterized by specific, identifiable
features that may be related to the nature of the translation activity
itself" (p. 90). The features of translation on which the author focusses
are explicitation, normalization, simplification and "levelling-out" (all
suggested by Baker in the mid-1990s). Olohan also includes four of
her own case studies to illustrate in more detail how corpus-linguistic
methods can be used to explore these features. Chapter 8,
entitled "Translators, style and ideology", primarily gives suggestions
for how to use corpora to analyse a translator's style, briefly reviewing
some of the literature on stylometry and stylistics. The ideology aspect
is merely touched upon. The chapter concludes with two of the author's
own case studies, which look at contraction patterns and lexical
choices using keyword analysis.

Chapters 9 and 10 shift the focus from theory and research methods
to useful applications for students of translation and professional
translators. The chapters are aptly called "Corpora in translator
training" and "Corpora in translation practice". The former focuses on
suggestions for the use of parallel and comparable corpora by
students and teachers of translation. The latter gives examples of how
corpus methods can be used in technical as well as literary
translation. It also gives a brief outline of corpus availability on the

The three-page Conclusion is used to highlight some of the "more
salient issues raised in the volume and touch upon potential future
developments in the use of corpora in translation studies" (p. 190).
Here, Olohan brings up the necessity of building corpora of translation
for a larger number of languages. She also concedes that her
book "will appear to have foregrounded research into literary
translation" (p. 191) and goes on to motivate why this should be the
case. Olohan draws attention to the fact that corpus-based studies of
translation overwhelmingly deal with contemporary texts and
encourages diachronic perspectives on translation studies research,
specifically in order to explore the influence of norms. She goes on to
argue that there is "still scope for continued cross-fertilization" (p. 191)
between translation studies and other disciplines, giving stylometry
as a prominent example of a discipline that has contributed
to studies of translator style. Another promising method that Olohan
draws attention to is the "dual approach" (p. 192) of combining
findings from comparable and parallel corpus analyses.

The final point Olohan highlights in the Conclusion is that finding
richer causal models (involving the formulation and testing of
explanatory and predictive hypotheses) is both desirable and
achievable, but only if we combine corpus techniques with other
analytical tools: "we need to study not just the texts but the translation
situation, from perspectives that are social, cultural, historical, political,
cognitive, and so on" (p. 192).

All chapters end with a brief passage of recommendations for further
reading. The majority of chapters also include a final section
called "Discussion and research points". There is also a three-page
glossary, which, although helpful, is surprisingly brief (there is, for
example, no definition of a comparable or parallel corpus). Also,
judging from the lack of consistency in the definitions, the glossary
would have benefitted from further editing.


Olohan's textbook will, no doubt, be a "must read" not just among
researchers in translation studies interested in corpus methods, but
also among corpus linguists interested in translation. Reading this
book is an excellent way to get up to date with recent developments
in the use of corpus methodology in translation studies. The chapter
on corpus design is a particularly excellent introduction with plenty of
good advice and tips for the amateur corpus compiler. Not
surprisingly, references to Olohan 2004 are already showing up in
other publications in the field, such as Aijmer & Alvstad (2005:1) who
argue that "[t]he methods of corpus linguistics and the use of corpora
have become an [...] important tool in translation studies reflecting the
growth of computer technologies and the use of corpora in general
linguistics (Laviosa 2002, Olohan 2004)".

On the whole, the discussion and research points that end every
chapter address relevant and useful issues. Only occasionally do the
questions extend much beyond the book itself, as in the case of a
question about identifying "points at which translation studies adopted
ideas, theories and methods from contrastive linguistics" (p. 34). The
question is raised despite the fact that contrastive linguistics is not
described in any systematic way, which makes it impossible to answer
without more background from some other source. Also, I find it
somewhat ungenerous that contrastive linguistics is dealt with
from the perspective of no more than one article (Altenberg 1998).

The chapter on corpus tools and data analysis mainly deals with
general concordancing tools for monolingual corpora. There is only
one page of running text on bilingual concordancing (p. 75) in
particular, which is surprising. Even semantic prosody is given more
space, although it is not treated from a translation perspective at all.
The final section on statistics will probably also leave most readers
wondering whether any particularly useful measures exist that apply to
translation. In other words, the chapter could have been more
targeted to translation. As it stands, only 3 pages out of 28 deal
directly with translation.

Several case studies from the author's work in progress are presented
in chapter 7. These studies are generally interesting and the budding
corpus-using translation researcher will doubtless find plenty of food
for thought here. Only occasionally do the case studies make the
chapter (by far the longest chapter of the book) seem overly rich in
information. For example, the chapter confusingly re-uses the same
table three times (7.7, 7.11, 7.17), instead of referring back to one
single table.

The chapter on corpora in translator training is only 8 pages long,
which makes it the shortest chapter of the book. It would have been
highly interesting to have more background here. For example, a
discussion of questions like "How widely used are corpora in translator
training?" and "Are they becoming the norm?" would have been useful.

The chapter on corpora in translation practice contains a section
called "Web as corpus", which takes a somewhat naive approach. The
only argument mentioned against using the web as a corpus is that
the "results will not appear in a format appropriate for browsing
linguistic data" (p. 184). Fundamental issues in corpus design are
conspicuously absent, like representativeness, reliability (the extent to
which an investigation yields the same results on repeated trials) and
verifiability (whether other researchers who have access to your
material can verify or falsify your results), which enable researchers to
make generalizations about language use. Related to this overly
insouciant view of what a corpus is, there is another point that I would
have liked to see discussed more, namely, the claim that a translation
memory "may be regarded as a type of parallel corpus" (p. 187). This
is misleading from the point of view of (attempted) representativeness
being a basic characteristic of a corpus, as defined by corpus
linguists. A translation memory does have aligned sentences in two
different languages, but the resemblance to an actual linguistic corpus
basically ends there.

The book shows a laudable awareness of the common Anglo-
American bias in corpus linguistics and explicitly states a desire to
spread the use of corpora to a larger number of languages. The latter
is emphasised as a "salient issue" in the book--it is important that "we
continue to expand the languages for which we build corpora for
translation" (p. 190)--although such statements are more or less
restricted to the Conclusion. Chapter 10 does include a section on
the "availability of corpus resources worldwide", but it gives a fairly
insular impression in that only the UK and the main languages
relevant to the situation in the UK are mentioned.

The absence of any discussion of machine translation (MT) is
noteworthy. MT is a large area of corpus application in terms of the
number of translations performed, yet it is only mentioned twice--in
passing. Of course, Olohan is certainly not the only culprit in this
respect and, considering that translation studies is an emerging field,
with the use of corpora being particularly recent ("spanning no more
than ten years", p. 1), it is to be expected that there will be discussion
regarding delimitations and definitions. I do think, however, that the
intended audience will be left wondering why MT is not offered some
space in a large publication on corpora in translation studies,
especially since chapters on translation training and translation
practice are included. Furthermore, since translation studies "is the
academic discipline that concerns itself with the study of translation"
(p. 1), the potential and limitations of MT should be highly relevant.

Olohan's book is written in a lucid style and the topics are presented
in a clear manner. (My main complaint concerning reader-friendliness
is that more illustrations would have made it even clearer.) Yet
another merit of "Introducing Corpora in Translation Studies" is that it
is unusually balanced for an introductory book. The fact that it
concludes that corpus techniques "can only go so far on its own" and
that they will "play a vital role in combination with a range of other
approaches and methods" (p. 192) seems to suggest that corpus
linguistics is becoming more mature.


Annelie Ädel is a postdoctoral fellow at the English Language Institute
of the University of Michigan, Ann Arbor. She earned her Ph.D. in
English Linguistics from Göteborg University, Sweden, in 2003. Her
research interests include text and corpus linguistics, discourse
analysis, translation and contrastive linguistics. She has presented her
work at conferences in Sweden, Norway, Italy, Spain, Belgium, the UK
and the US, and recently spent two years as a visiting scholar at
Boston University. She combines her research and teaching with work
as a professional translator.

