LINGUIST List 24.2625

Thu Jun 27 2013

Review: Applied Linguistics; Computational Linguistics: Frankenberg-Garcia et al. (2012)

Editor for this issue: Joseph Salmons

Date: 24-Mar-2013
From: Robert Poole
Subject: New Trends in Corpora and Language Learning

EDITOR: Ana Frankenberg-Garcia
EDITOR: Lynne Flowerdew
EDITOR: Guy Aston
TITLE: New Trends in Corpora and Language Learning
SERIES TITLE: Corpus and Discourse
PUBLISHER: Bloomsbury Publishing (formerly The Continuum International Publishing Group)
YEAR: 2012

REVIEWER: Robert E Poole, University of Arizona

SUMMARY
“New Trends in Corpora and Language Learning” provides a comprehensive look at recent developments in corpus approaches for teaching and learning language. The 15 chapters were developed from presentations delivered at the 2008 Teaching and Language Corpora Conference (TaLC). Part 1 features chapters detailing current approaches to using corpora and corpus tools by language learners from contexts around the world. These chapters explain approaches that place data in the hands of the learner and report the benefits of such instruction and learners’ responses to the methods and tools. The next section discusses tools that learners can employ and exploit for their language learning, from multimodal concordancing software to a collocation feedback program, and includes chapters detailing recent developments in machine translation and parallel corpora. Finally, section three includes chapters discussing insights made possible through analyses of learner corpora and the pedagogical implications of the findings.

PART I: Corpora with language learners: use
Opening the text, Yukio Tono’s chapter “TaLC in action: recent innovations in corpus-based English language instruction in Japan” details several novel corpus-based applications while reporting the popularity of corpora in Japan and the potential for their success elsewhere. Of particular interest was the description of a corpus-based TV English program that more than 1 million people watch per year. It spawned a popular children’s character, “Mr. Corpus”, and won an award for best TV program from Japan’s public broadcasting center. The show enjoyed such great popularity that similarly themed iPhone applications have been produced and a corpus-based Wii game application is being developed. These success stories, Tono asserts, display the potential for and viability of corpus-based approaches in Japan and elsewhere.

The second chapter, “Using hands-on concordancing to teach rhetorical functions: evaluation and implications for EAP writing classes” by Maggie Charles, presents a discourse-analytic approach for the teaching of writing. While a common critique of corpus pedagogy has been its focus on bottom-up approaches to language learning, Charles presents a model that integrates top-down and bottom-up processing and moves learners beyond the lexicogrammar of individual sentences to rhetorical features of the discourse. The 49 international graduate students responded quite positively to the approach and noted the affordances the corpus approach provides for the teaching and learning of academic writing. In closing, Charles presents a three-stage process that she believes will transition students from corpus awareness to corpus literacy and, finally, corpus proficiency.

Another chapter detailing a corpus-based pedagogical approach is presented by Bernhard Kettemann in Chapter 3. “Tracing the emo side of life: using a corpus of an alternative youth culture discourse to teach culture studies” presents an approach for the teaching of a particular alternative discourse in a university-level Cultural Studies course. Students displayed motivational and engagement gains and the corpus-based approach was claimed to advance student-centered learning while also providing a valued alternative to traditional theory-based frameworks and texts. Through a combination of deductive and inductive learning, students displayed an increase in awareness of the connection between language and culture. Kettemann asserts the value of integrating corpus work into mainstream pedagogy but also acknowledges challenges, e.g. text and text type selection, that must be overcome for corpus study to succeed.

Pedagogical approaches continue with Natalie Kübler in Chapter 4 on applications of corpora for translation and the teaching of translators. “Working with corpora for translation teaching in a French-speaking setting” explains limitations facing more complete integration of corpus approaches, e.g. limited availability of parallel corpora, but asserts the potential of corpus translation of specialized texts and the need for translators in training to receive instruction in the basic concepts of corpus linguistics. Kübler also writes that translators need to have the ability to construct their own specialized corpora for particular translation tasks. The chapter presents several classroom-tested activities for raising awareness of corpus for translation tools and the potential learning gains.

The final chapter of section one, “IFAConc: a pedagogic tool for online concordancing with EFL/EAP learners” by Przemyslaw Kaszubski, presents and assesses an online concordancing program for the teaching and learning of academic writing by university students in Poland. The IFAConc concordance package, designed to meet the pedagogical needs of students in an EAP writing class, was created with the learner in mind; search parameters, annotation features, and search history interfaces were made as intuitive and user-friendly as possible. However, the pedagogical aims of the concordance package do not limit its versatility, as Kaszubski’s design enables many types of inquiries into linguistic features while also making sharing, saving, and annotating findings possible. Piloted in two classrooms and receiving generally favorable responses, the package, Kaszubski notes, is constantly evolving as updates and improvements are periodically implemented into the system. The practicality of the tool and its potential for more complete integration into an EAP writing curriculum are indeed promising.

PART II: Corpora for language learners: tools
Section 2 begins with a chapter from Anne Li-E Liu, David Wible, and Nai-Lung Tsao titled “A corpus-based approach to automatic feedback for learners’ miscollocations” that details a method for identifying miscollocations in L2 learner writing and a means for providing immediate suggestions of proper collocations to the user. Applying the notions of intercollocability and substitutability, the software identifies collocation clusters that enable identification of miscollocations and makes recommendations for corrections. The collocation cluster and intercollocability information are shown to be valid means of correcting miscollocations. With issues of detection and correction seemingly resolved, the authors explain how the tool could be integrated into an online language learner platform to be used by second language writers.

One of the more intriguing chapters is Francesca Coccetta’s “Multimodal functional-notional concordancing”. She notes that corpus approaches have traditionally been employed for the analysis of written texts; however, Coccetta’s rather novel approach shows how a spoken corpus of audio and video texts can be organized, annotated, and exploited for language learning and teaching. The program provides insights into the various semiotic resources at play in the creation of meaning. Beyond detailing the multimodal concordancer and a scalar method for annotating oral discourse, Coccetta presents two data-driven activities for language learning. The chapter raises interesting questions for corpus techniques and their application to oral discourse while asserting the need for greater use of corpus approaches for the teaching of speaking and listening.

Chapter 8 by Alejandro Curado Fuentes, “Academic corpus consultation in MT and application to LSP teaching”, presents a sophisticated content-based machine translation approach (CBMT) that aims to produce translations of written English into Spanish. The n-gram based approach, when applied to a corpus of written academic discourse, demonstrated the ability to identify a variety of linguistic data. The system, as Fuentes asserts, improves the quality of machine translation of specialized texts and can significantly decrease the amount of time and cost required for translation. Fuentes further states that the approach may be exploited by teachers of English for Specific Purposes to teach particular specialized discourses through a contrastive corpus-based data-driven learning approach.

Martin Warren follows in Chapter 9, “Using corpora in the learning and teaching of phraseological variation”. Warren explains ConcGram (Greaves, 2009) and its ability to identify and display output in a manner quite different from the more traditional keyword in context (KWIC) format. He states that while a KWIC display features a centered node word, ConcGram instead highlights the node as well as co-occurring words in a layout that draws learner attention away from the node item to its surrounding co-occurring features. The ConcGram approach is lauded for its ability to identify three types of phraseological variation: meaning shift units (Sinclair, 2007), collocational frameworks (Renouf and Sinclair, 1991) and organizational frameworks. The author states that traditional n-gram focused approaches exhibit only a limited view of variation in phraseology. Warren suggests concgramming can serve as a tool for textual analysis, an approach for raising learner awareness of the idiom principle, and a means for revealing field- and genre-specific discourse features.
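
For readers unfamiliar with the distinction, the contrast between a node-centered KWIC display and a search for co-occurring words regardless of position can be illustrated with a minimal Python sketch. It is not ConcGram and does not reproduce its algorithm; the function names (`kwic`, `cooccurrences`), the sample sentences, and the four-token span are assumptions made purely for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def kwic(tokens, node, width=3):
    """Return KWIC lines: the node word with `width` tokens of context on each side."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{tok}] {right}")
    return lines


def cooccurrences(tokens, word_a, word_b, span=4):
    """Return (position_a, position_b) pairs where the two words occur
    within `span` tokens of each other, in either order."""
    hits = []
    for i, tok in enumerate(tokens):
        if tok != word_a:
            continue
        lo, hi = max(0, i - span), min(len(tokens), i + span + 1)
        for j in range(lo, hi):
            if j != i and tokens[j] == word_b:
                hits.append((i, j))
    return hits


# Hypothetical sample text, not taken from any corpus discussed in the book.
text = (
    "They play a key role here. "
    "The role they play matters a great deal to everyone involved. "
    "A role is easy to play well."
)
tokens = tokenize(text)

for line in kwic(tokens, "role", width=2):
    print(line)
print(cooccurrences(tokens, "play", "role", span=4))
```

In this toy text, the KWIC lines for “role” show only the contiguous context around the node, whereas the co-occurrence search finds “play” and “role” together in different positions and in both orders, which is the kind of variation the chapter argues n-gram focused searches tend to miss.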

In Chapter 10, “The SACODEYL search tool: exploiting corpora for language learning purposes”, Johannes Widmann, Kurt Kohn, and Ramon Ziai report on a pedagogically motivated, user-friendly spoken language corpus of video interviews of secondary school students representing seven European languages. Each language corpus has 25 interviews, annotated and aligned with their transcripts. The corpora require little training, are user-friendly, and are designed with a language learner in a secondary school context in mind. Reflecting its focus on younger learners, the corpus is divided by topics such as hobbies and plans for the future. The authors comment that this topic-oriented construction differs from many traditional concordancing programs as it allows students to focus on areas of particular interest. In addition, the package comes with pedagogical materials to aid the teacher in making lesson plans.

PART III: Corpora by language learners: learner language
Part III opens with a chapter from John Osborne, “Oral learner corpora and the assessment of fluency in the Common European Framework”. The chapter details how findings from learner corpora may be applied to the assessment of foreign language oral production. In the project, interviews were independently rated using the Common European Framework (CEF) standards and then analyzed for a variety of quantitative and qualitative features such as pauses, length of utterance, syntactic units, and information units, among others. The author shows how benchmarking has the potential for automatic rating of oral productions. While this study indexes the interviews using CEF standards, application of other frameworks is also possible. The author does mention several limitations but notes that automatic measurements can quickly provide ‘rough’ and useful profiles of a learner’s fluency.
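
As a rough illustration of what such automatic measurements might look like, the Python sketch below computes a few temporal measures from a time-aligned transcript. It is not Osborne’s procedure: the input format (token, start time, end time in seconds), the function name `fluency_profile`, the 0.25-second pause threshold, and the sample data are all assumptions.

```python
def fluency_profile(words, pause_threshold=0.25):
    """Compute simple temporal fluency measures.

    `words` is a time-ordered list of (token, start_sec, end_sec) tuples.
    A silent gap of at least `pause_threshold` seconds between two words
    counts as a pause and ends the current run of speech.
    """
    if not words:
        raise ValueError("words must contain at least one entry")

    runs = [1]      # number of words in each uninterrupted run
    pauses = []     # duration of each detected pause, in seconds
    for (_, _, prev_end), (_, start, _) in zip(words, words[1:]):
        gap = start - prev_end
        if gap >= pause_threshold:
            pauses.append(gap)
            runs.append(1)
        else:
            runs[-1] += 1

    total_time = words[-1][2] - words[0][1]
    return {
        "words_per_minute": 60 * len(words) / total_time,
        "pause_count": len(pauses),
        "mean_pause_sec": sum(pauses) / len(pauses) if pauses else 0.0,
        "mean_length_of_run": len(words) / len(runs),
    }


# Hypothetical time-aligned utterance (seconds), invented for illustration.
sample = [
    ("i", 0.0, 0.2), ("like", 0.2, 0.5), ("football", 0.5, 1.0),
    ("and", 1.8, 2.0), ("music", 2.0, 2.5),
    ("very", 3.5, 3.8), ("much", 3.8, 4.0),
]
profile = fluency_profile(sample)
print(profile)
```

Measures of this kind capture only the temporal side of fluency; the syntactic and information units mentioned in the chapter would require annotation beyond word timings.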

Chapter 12, “Preferred patterns of use of positive and negative evaluative adjectives in native and learner speech: an ELT perspective”, is a contribution from Sylvia De Cock on the patterns of negative and positive attitudinal stance markers in native and learner speech, and it offers several implications of the findings for English language teaching (ELT). Through a contrastive analysis approach, the study identifies variation in syntactic and collocational patterns of attitudinal markers and finds several items that could be treated in the classroom. For example, De Cock finds a native speaker preference for evaluative adjectives occurring in relative clauses beginning with “which”. However, this syntactic preference occurs with much lower frequency in the learner corpora. The author suggests this feature and several others explained in the chapter could be included in ELT materials, and activities based on the native and learner data could be successfully integrated into the classroom.

Hilary Nesi in Chapter 13, “BAWE: an introduction to a new resource”, introduces the British Academic Written English (BAWE) corpus and discusses its construction and design. The corpus consists of approximately 3,000 written university assignments compiled in response to the concern that there was insufficient information about the types of academic writing students completed. The author details the 4x4 design matrix that was used for systematic collection and organization of the assignments across four levels and four broad disciplinary groups. The author notes that this construction across levels and disciplines distinguishes the collection from other similar corpora, e.g. the Michigan Corpus of Upper-Level Student Papers (MICUSP) (Römer and Wulff, 2010) and the Portland State University Corpus of Student Academic Writing (Conrad and Albers, 2008). The corpus was annotated along several dimensions such as functional features and genre family. The author closes with a review of several publications with findings from the BAWE and suggests further research that may be conducted with the use of the corpus.

Continuing with findings from learner corpora is Anna-Maria Hatzitheodorou and Marina Mattheoudakis’s chapter, “The impact of culture on the use of stance exponents as persuasive devices: the case of GRICLE and English native speaker corpora”, which compares stance and persuasive devices in a Greek learner corpus and an English native speaker corpus. The study investigates differences in how the two groups deploy rhetorical strategies to persuade their reader. The research is informed by Hofstede’s (1980) model of cultural dimensions, with differences in stance markers interpreted using the framework. One example the authors report is that Greek writers use persuasive boosters (e.g. of course, undoubtedly) more frequently than hedges and attitude markers while many fewer instances of boosters were found in the native writer corpus. They also report that native writers are more likely to use hedges and typically refrain from using boosters in their writing. Applied to the Hofstede model, the authors suggest the difference can be explained through Anglo-American rhetorical conventions that discourage bold statements and instead leave space for alternative opinions. The authors detail several other differences in the use of stance markers while offering interpretations of the variation through the Hofstede model. The authors correctly caution against explicit and prescriptive instruction but do suggest that L2 learners could benefit from consciousness-raising activities that illuminate connections between culture and writing practices.

The text closes with a chapter, “Polishing papers for publication: palimpsests or procrustean beds?”, from John McKenny and Karen Bennett that compares journal submissions written by Portuguese academics to a corpus of native speaker journal articles published in the same field. The study investigates variation in syntactic, lexical, phraseological, and discourse features that may impact the ‘naturalness’ (p. 247) of the texts and that may function as an obstacle to publication. The authors reveal differences in a variety of features such as use of nominalization, overuse of the genitive, and collocational patterns. While the authors do not advocate stylistic norming and acquiescence to perceived native speaker norms, they do call attention to the real repercussions possibly experienced by L2 writers seeking to publish in international journals. Similar to other chapters, they recommend awareness-raising activities while also advocating the value corpus studies can have in revealing cultural differences in academic writing.

EVALUATION
As evident in the chapter summaries, this recent publication on trends in corpora and language learning covers a variety of issues, presents compelling advances in corpora for numerous contexts and purposes, and raises important questions for further research. From a corpus-based television program to rhetorical discourse annotation and on to multimodal concordancing, the possibilities for continued development of corpus tools and the potential for greater integration of corpus approaches into the classroom are clearly on display. However, several chapters lack the type of empirical evidence needed if corpus approaches are to gain greater access into mainstream classrooms.

While the insights into learner attitudes are indeed valuable, further research into learning gains is needed. This need for continued research is noted in many chapters as authors consistently pose questions and present challenges for future research to address. Also, no chapter directly speaks to the need to train future language teachers in corpus linguistics and corpus pedagogy; the one chapter on training dealt with translators. Nonetheless, the book makes a valuable contribution and many of the ideas here will inspire those seeking increased integration of corpus approaches in language learning environments. These authors indeed push the field in interesting directions as they move corpus approaches beyond the bottom-up approaches that characterized earlier work in the field to more dynamic strategies.

From pedagogy to corpus tools and learner corpora analysis, this volume coherently surveys the latest developments in corpora while also consistently raising questions and encouraging continued research. Whether a reader’s interest is classroom pedagogy or software developments, this comprehensive text on new trends in the field will certainly be of value. Importantly, this volume will appeal to a wide audience as it offers plenty to interest those familiar with corpus approaches while remaining accessible to those new to the area.

REFERENCES
Conrad, S. & Albers, S. (2008). A new corpus of student academic writing. Paper presented at the American Association for Corpus Linguistics Conference, Brigham Young University, Utah.

Hofstede, G. (1980). Culture’s consequences: international differences in work-related values. London: Sage.

Johns, T. (1994). From printout to handout: Grammar and vocabulary teaching in the context of data-driven learning. In T. Odlin (Ed.), Perspectives on pedagogical grammar. New York: Cambridge University Press, 293-314.

Renouf, A.J. & Sinclair, J. (1991). Collocational frameworks in English. In K. Aijmer and B. Altenberg (Eds.), English corpus linguistics. London: Longman, 128-143.

Römer, U. & Wulff, S. (2010). Applying corpus methods to writing research: exploration of MICUSP. Journal of Writing Research, 2(2), 99-127.

Sinclair, J. (2007). Collocation reviewed. Manuscript. Tuscan Word Centre, Italy.

ABOUT THE REVIEWER
Robert Poole is a Ph.D. student in the Second Language Acquisition and Teaching program at the University of Arizona. His research interests include corpus linguistics, corpus pedagogy, and discourse analysis.

Page Updated: 27-Jun-2013