LINGUIST List 24.331

Sat Jan 19 2013

Review: Text/Corpus Linguistics; Translation: Straniero Sergio & Falbo (2012)

Editor for this issue: Monica Macaulay

Date: 19-Jan-2013
From: Mauro Costantino
Subject: Breaking Ground in Corpus-based Interpreting Studies

Reviewer: Mauro Costantino, Universidad Mayor de San Andrés

EDITORS: Straniero Sergio, Francesco and Falbo, Caterina
TITLE: Breaking Ground in Corpus-based Interpreting Studies
SERIES TITLE: Linguistic Insights - Volume 147
PUBLISHER: Peter Lang
YEAR: 2012

Mauro Costantino, Language and Linguistics Department, Universidad Mayor de San Andrés, La Paz, Bolivia


“Breaking Ground in Corpus-based Interpreting Studies,” edited by F. Straniero Sergio and C. Falbo, is a collection of seven papers connected by the same goal: to begin filling the current gap in research on corpus-based interpreting studies. The papers range from 'introductory' to 'case study', thus allowing the book to present both new projects under development and applications of corpora that are already usable.

Introduction: “Studying interpreting through corpora. An introduction.” (Francesco Straniero Sergio and Caterina Falbo)

This chapter presents a fairly extensive introduction to the work, touching on all the topics needed to provide a complete reference even for the reader who is not acquainted with corpus-based studies or with corpus building and analysis. Starting from the basics of corpus design (Tognini-Bonelli, 2001) and representativeness (Barbera et al., 2007b), it moves to more specific points such as the theoretical and methodological issues of translation and interpretation corpora. Finally, the introduction addresses issues of spoken and speech corpora in relation to transcription and research matters, and it sketches the individual approaches of the corpora presented later in the book, thus guiding the reader through a well-organized presentation of the entire work.

Chapter 1: “The European Parliament Interpreting Corpus (EPIC): implementation and developments.” (Mariachiara Russo, Claudio Bendazzoli, Annalisa Sandrelli and Nicoletta Spinolo)

This work presents the implementation and development of the EPIC project in a clear chapter that details all the steps of the corpus planning and building process. The methodology sections exhaustively present all the steps involved: data collection, the digitizing process, transcription (detailing the linguistic and paralinguistic levels) and finally the extra-linguistic aspects of metadata and corpus annotation. As for the analysis carried out in the second part of the paper, the work is well structured and presents the research in a clear manner, which could also make it suitable as introductory material for students approaching corpus-based studies, whether dealing with interpreting corpora or not.

Chapter 2: “From international conferences to machine-readable corpora and back: an ethnographic approach to simultaneous interpreter-mediated communicative events.” (Claudio Bendazzoli)

The second chapter stands a little apart from the rest of the book, since it deals mainly with the taxonomic issue of classifying interpreter-mediated data. Unlike the other papers in the collection, the chapter does not take the analysis of corpus data as the objective of the study. It is, instead, a sound study of the intricate problems, both theoretical and technical, that led to the development of the header of DIRSI-C (Directionality in Simultaneous Interpreting Corpus). After presenting the data collection issues of the DIRSI corpus and multimedia archive, the paper focuses on the methodological issue of building a corpus of communicative interaction, comparing the methodology with that of the EPIC corpus of the previous chapter. Discussing the theoretical bases for implementing a set of metadata that allows one to distinguish, and therefore query, the various speech events and the participants’ roles, it finally proposes a full taxonomy for the DIRSI header. The chapter offers better insight into the planning and building process that a sound corpus needs, and leaves the field open to further research and development.

Chapter 3: “Introducing FOOTIE (Football in Europe): simultaneous interpreting in football press conferences.” (Annalisa Sandrelli)

The third chapter introduces the FOOTIE corpus in a clear and well-contextualized description of the building process, as well as the methodology and the resulting structure of the corpus itself. The paper starts by presenting the aims and goals of the project, giving a clear idea of the parameters that form the corpus structure; the second section gives a brief but exhaustive contextualization, useful for the reader who might not be acquainted with the football translation/interpretation panorama. Data collection and transcription issues are briefly sketched in the following section, which refers the reader to the first chapter for more information and thus avoids unnecessary repetition. The last and main section of the paper focuses on press conferences as a communicative situation, discussing the need for special treatment in the building of an interpreting corpus; a fair number of examples support the author in presenting and discussing the theme.

Chapter 4: “CorIT (Italian Television Interpreting Corpus): classification criteria.” (Caterina Falbo)

This fourth chapter presents the ongoing work on CorIT (Italian Television Interpreting Corpus), ranging from the classification criteria to transcription and finally interrogation features. The presentation of classification criteria is complete and clearly articulated (even though it refers to previous data for a thorough discussion of the criteria), giving the reader a complete panorama of the matters involved in selecting such a focal point in corpus building.

Chapter 5: “Topical coherence in television interpreting: question/answer rendition.” (Eugenia Dal Fovo)

This chapter presents ongoing doctoral research based on a sub-corpus of CorIT (Italian Television Interpreting Corpus). The main idea was developed from a previous MA thesis that the present article uses as a launching pad in order to develop better criteria for studying question/answer rendition in television interpreting. The question and methodology sections present the work in a complete and well-developed manner, clearly stating the research questions and detailing the corpus structure and content data.

Chapter 6: “Using corpus evidence to discover style in interpreters' performances.” (Francesco Straniero Sergio)

The sixth chapter presents an innovative study of 'style' in interpreters' performances. The work is well presented and discussed, opening the field to some hitherto less considered aspects of Interpreting Corpora such as 'style' and 'recognizability', or 'modus interpretandi' as the author calls it. This relatively short paper, supported by the ample use of examples, gives a clear idea of the potential of the tools used for data retrieval (CorIT, Italian Television Interpreting Corpus). The chapter achieves its goal by setting a good starting point for corpus-based interpreting research on style and stylistic features.

Chapter 7: “Data collection in the courtroom: challenges and perspectives for the researcher.” (Marta Biagini)

The last chapter of the collection presents a new project for an Interpreting Corpus based on courtroom recordings. Since the project itself is still in its preliminary phase of data collection, the author presents the theoretical and methodological issues that characterize the planning stage of such a complicated project. She details these issues with clarity and with good contextualization of the reality of the Italian court system. The research questions and the procedures for data collection are well presented and discussed. In the end, the paper presents the project in its preliminary phase and offers some interesting hypotheses for future development in the conclusions, thus achieving its goal.


“Breaking Ground in Corpus-based Interpreting Studies” is a well-structured and, above all, innovative work. As suggested in the title, the current panorama of Corpus-based Interpreting Studies is fairly limited, and the work attempts to fill this gap.

The work completely achieves the dual goal of discussing ongoing research and of presenting future perspectives and developments. The theoretical and methodological discussions are sound and also helpful for the researcher who has recently begun the corpus-based study of interpreting interactions. It may lack a bit of in-depth analysis of the full potential of the searching and indexing methods, a flaw that is easily overcome by the good battery of examples and data presented.

The small downsides, some of them mainly editorial, others due to the specific presentation of data or results, do not detract from the results whatsoever; they only require more time for the reader to analyze them.

On the downside, the corpus linguist reader should be warned that, if s/he is not acquainted with technical interpreting-related vocabulary, s/he might need to do a little research in case s/he wishes to explore this particular aspect of the matter. This trivial shortcoming of the introduction is sometimes shared by the other sections of the book, owing to the point of view of the work, but it is in any case quickly overcome halfway through the book, by which point the reader will already be acquainted with the technical terms. In conclusion, data and objectives are clearly expressed, and the chapter serves well its role of connecting the subsequent papers.

As far as the corpus linguist reader is concerned, a small inaccuracy might be found in the lack of complete data (Chapter 1); including type and token counts would have given a clearer picture of the corpus. Also, a few more lines could have been spent describing the numerous possibilities of the CWB (Corpus Workbench) and its CQP (Corpus Query Processor) corpus query system (Christ 1994), in order to better explain the possible reach of the whole corpus. Nevertheless, it must be noted that all the information can be easily retrieved through the references.
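
For readers less familiar with the terms, the type and token counts mentioned above can be obtained with very little code. The following is only a minimal sketch: the function name, the simple regular-expression tokenizer and the sample sentence are assumptions made for illustration, and they do not reproduce the tokenization used in EPIC or in CWB.

```python
import re
from collections import Counter


def type_token_counts(text):
    """Return (tokens, types) for a text.

    Assumption: a token is a lowercased run of word characters, optionally
    joined by apostrophes or hyphens. Real corpora use their own tokenizers.
    """
    tokens = re.findall(r"\w+(?:['’-]\w+)*", text.lower())
    types = Counter(tokens)  # one entry per distinct word form
    return len(tokens), len(types)


# Invented sample sentence, not taken from any corpus discussed in the book.
sample = "Thank you Mr President thank you very much"
n_tokens, n_types = type_token_counts(sample)
print(n_tokens, n_types, round(n_types / n_tokens, 2))  # tokens, types, type/token ratio
```

Reporting both figures, or their ratio, gives the reader an immediate sense of the size and lexical variety of a corpus.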

Only one shortcoming is present in the analysis developed in Chapter 1: while the structure is sound and well supported by data, when 'trends' and 'statistical significance' are discussed one would expect the statistics, together with the underlying data and 'p' values, to be reported in the text.
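
To make concrete what kind of figures one would expect to see, here is a hedged sketch of one common way to obtain a test statistic and a 'p' value when comparing the frequency of a feature in two sub-corpora. The choice of Pearson's chi-square test on a 2x2 table is an assumption made for illustration; the review does not state which test the authors of Chapter 1 applied, and the function name and layout are hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Rows could be two sub-corpora, columns the presence or absence of a feature.
    Returns (chi2, p), where p is the upper-tail probability for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs a non-zero total")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, chi2 is a squared standard normal variable,
    # so the upper-tail probability equals erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Placeholder counts invented for illustration only; NOT data from EPIC.
chi2, p = chi_square_2x2(30, 70, 50, 50)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

Reporting the counts, the statistic and the 'p' value together lets the reader judge the claim of significance independently.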

Similarly, the analysis in Chapter 3 would have benefited greatly from details of the corpus (even if partial or estimated), such as the duration of the content in minutes and the word count. Considering the introductory nature of the research this can be seen as a minor downside, but one that still tends to limit the reader in interpreting the scope of the research.

As for the analysis presented in Chapter 4, the only shortcoming is the presentation of figures. Since they are simple screenshot images, it is difficult for the reader to actually read their content, and they do not really add any substantial information to the text. On the other hand, this might be considered an editorial shortcoming, not really a content issue.

The second part of the chapter presents some controversial concepts related to interpreting modes, interaction types and genres of spoken discourse, which offer a detailed panorama of the fine-grained distinctions among the texts that form CorIT, thus giving a better understanding of the full potential of the corpus. A little more space could have been dedicated to the transcription and interrogation part, in order to promote more research ideas for future study and development.

In Chapter 5, the only inconvenience is found in the results section, where the author presents many similar graphs and tables that might end up confusing the reader rather than helping her/him. In order to compare omissions and substitutions, a more synthetic presentation showing just one table with percentages would probably have helped. The pie chart presenting the frequency (oddly given in percentages instead of numbers) does not add to the information presented, and it seems a simple table could have done the job of comparing frequencies just as well. As for the figures, it appears that four separate figures, each one with its own table, presenting the percentage of satisfactory, medium and unsatisfactory degree of coherence do not help understanding; presenting each figure with a different scale requires even more time for the comparison. One figure with four bars (wh-question, Yes/No question, Leading question, Declarative question), each one divided into its three possible results (satisfactory, medium and unsatisfactory), would possibly have helped comparison and improved clarity. So even though all the needed information is actually present, the use of many tables and graphs results in something of a burden for the reader. As for the editorial part, the Excel tables presenting the question and answer under examination could have been converted into more reader-friendly examples. In fact these details of form do not affect the quality of the content, but neither do they support it. Nonetheless, the chapter opens the field to some interesting study of interaction, conversation analysis and topical coherence through interpreting corpora, which is the aim of the entire work.
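
The single-table presentation suggested above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical and the counts are placeholders invented for the example, not figures from the chapter. Only the four question types and the three degrees of coherence come from the text.

```python
RATINGS = ("satisfactory", "medium", "unsatisfactory")


def percentage_table(counts):
    """Turn raw counts per question type into row percentages on one common scale."""
    table = {}
    for question_type, by_rating in counts.items():
        total = sum(by_rating.get(r, 0) for r in RATINGS)
        table[question_type] = {
            r: (100.0 * by_rating.get(r, 0) / total if total else 0.0)
            for r in RATINGS
        }
    return table


def format_table(table):
    """Render the percentages as one plain-text table, one row per question type."""
    lines = [f"{'question type':<22}" + "".join(f"{r:>16}" for r in RATINGS)]
    for question_type, row in table.items():
        cells = "".join(f"{row[r]:>15.1f}%" for r in RATINGS)
        lines.append(f"{question_type:<22}{cells}")
    return "\n".join(lines)


# Placeholder counts, invented for illustration only; NOT data from the book.
placeholder_counts = {
    "wh-question": {"satisfactory": 6, "medium": 3, "unsatisfactory": 1},
    "Yes/No question": {"satisfactory": 5, "medium": 4, "unsatisfactory": 1},
    "Leading question": {"satisfactory": 2, "medium": 5, "unsatisfactory": 3},
    "Declarative question": {"satisfactory": 4, "medium": 4, "unsatisfactory": 2},
}
print(format_table(percentage_table(placeholder_counts)))
```

Because every row is expressed as percentages of its own total, the four question types can be compared directly, which is the advantage a single stacked-bar figure would also offer.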

Finally, given the very innovative idea of an Interpreting Corpus of courtroom recordings, a more detailed explanation of the entire project presented in Chapter 7 could have greatly added to the paper. Will the corpus be indexed? Will it be POS-tagged (Part-of-Speech)? Will it be made available online? The paper could have included a little more information, which would have stimulated possible support and fruitful discussion by the academic community.

In the end, it is a well-structured work that gives a clear view of the ongoing research in corpus-based interpreting studies and stimulates many ideas for further development and research.


Barbera, Manuel, Elisa Corino & Cristina Onesti (eds.). 2007a. Corpora e linguistica in rete. Torino: Guerra Edizioni.

Barbera, Manuel, Cristina Onesti & Elisa Corino. 2007b. “Cosa è un corpus? Per una definizione più rigorosa di corpus, token, markup”, in Barbera et al., 2007a. pp. 25-88.

Christ, Oli. 1994. “A modular and flexible architecture for an integrated corpus query system”, COMPLEX '94.

Tognini-Bonelli, Elena (ed.). 2001. Corpus Linguistics at Work. Amsterdam/Philadelphia: John Benjamins.


Mauro Costantino is an invited professor at the Universidad Mayor de San Andrés (UMSA) of La Paz, Bolivia. His main interests range from Second Language Acquisition, comparing the acquisition of the Italian verb system by speakers of different languages, to Translation Studies and corpus linguistics (focusing on learner corpora). He teaches Italian, a translation seminar and an introduction to computational and corpus linguistics at UMSA, and actively participates in the VALICO and VALERE projects of the University of Torino, Italy.
