Review of Exploring Corpora for ESP Learning
Review:
AUTHOR: Gavioli, Laura
TITLE: Exploring Corpora for ESP Learning
SERIES: Studies in Corpus Linguistics 21
PUBLISHER: John Benjamins
YEAR: 2005
Philip M. McCarthy: Institute for Intelligent Systems, Department of Psychology, The University of Memphis
As a teacher and researcher in corpus linguistics, I have become increasingly frustrated over recent years by the mysterious lack of corpus analysis applications. There has been an avalanche of articles, books, and conferences dedicated to the glories of corpus linguistics, and yet there has been a dearth of practical, simple, solid, and (above all) interesting pedagogical applications. For me then, corpus linguistics in the classroom has been like Mars glittering in the night sky: so full of possibility and yet no one can tell us much more than that there might be water there. To justify a journey to Mars, we need to know what we could do with it once we have invested so much to get there. To justify the investment in building corpora for the classroom, we need to know, clearly, how we investigate corpora, and what corpora offer us that a traditional classroom textbook does not. With “Exploring Corpora for ESP Learning,” I am happy to report that the Eagle has landed, and that there is an abundance not only of water but of life itself.
TESOL has grown to considerable prominence over recent years, and despite its less than glamorous reputation, it has also managed to produce its own offspring in the form of English for Specific Purposes (ESP). And while ESP (lacking a fancy academic bloodline) may never be considered a legitimate heir, the wealth and interest such a field appears to be producing has made TESOL (in general) and ESP (in particular) a field that academia will, at the very least, have to send an invitation to for all forthcoming linguistic balls. Of course, with English being something of a world language, the wealth generated by TESOL (in general) and ESP (in particular) will forever mean that these fields cannot be completely ignored. What few seem to have recognized, however, and Laura Gavioli is clearly a notable exception, is that the golden child of (corpus) linguistics appears to have taken rather a shine to ESP. That is, corpora are relatively small collections of samples of a particular language register (e.g., science texts, public speeches, telephone conversations, etc.) and ESP is a field dedicated to teaching these registers as specialized, distinct, (almost quirky) modules of the blurry and so often contradictory superstructure. Thus, what a corpus is, is what ESP teaches, and the two are forever joined at the “lip.” In this succinct, yet thoroughly informative offering, Gavioli has managed to demonstrate this happy marriage of beauty and the beast. In seven concise chapters, Gavioli outlines just why it is that corpus analysis is an invaluable component of language research and language teaching, making this text an essential read for any teacher of ESP and any researcher interested in corpus studies.
SUMMARY
In the introductory chapter, Gavioli lays out her theme that small specialized corpora can help students better understand the idiosyncratic (or specialized) language of a given discourse community. Naturally, the emphasis on specialized language and specialized corpora gears the text very much towards an audience of ESP teachers; however, as Gavioli points out, even the most general of language courses will also feature many classes specializing in a variety of registers (e.g., letter writing, telephone calls, and job applications).
Chapter 2 serves as a brief history of corpus linguistics and a brief outline of the importance that corpus work can play in language pedagogy. Concerning the history, Gavioli relates the story of modern corpus development starting in the 60s and 70s and highlighted by the development of the Brown and LOB Corpora. Gavioli then moves on to the 70s and 80s and outlines how Chomsky’s work led to what many view as a hiatus in corpus investigations. In the 1990s, Gavioli argues, there was a revival in corpus work led by such figures as John Sinclair and his Cobuild project. Gavioli also explains that the 1990s saw a significant development not only in technology, but critically, in the availability of that technology. This technology, Gavioli argues, has allowed researchers (and teachers and students) to collect, store, and analyze data easily and cheaply.
Of course, the revival of corpus research has not been without its problems, and Gavioli is careful to detail such considerations. For example, Gavioli explains that corpora are _samples_ of language, rather than language itself, and we must be careful to recognize the limits these samples impose. Gavioli also cites Widdowson’s (1998) argument that “reality does not travel with the text” (p. 711). That is, a corpus can give us examples of language, but it does not provide the context in which the language was produced. As such, the information that the corpus supplies is not synonymous with all aspects of meaning. From a pedagogical point of view, Gavioli also reminds us of two further limitations of the corpus. First, we must remember that exposing students to the “real language” of a corpus does not mean that students’ language will improve. That is, exposure itself is not enough: students also need to be trained how to use and interpret corpus data. And second, Gavioli cites Carter (1998), who argues that while corpora often supply excellent authentic examples of language use, invented language examples are often more concise and more useful to students.
But while there are many concerns that researchers, teachers, and students need to consider when using corpora, there are clearly many benefits too. For example, corpus and concordance work may assist teachers in syllabus design (Flowerdew, 1993); in the analyses of discourse markers (Zorzi, 2001); in translation, synonymy, and issues of false friends (Partington, 1998); and in helping students to become autonomous researchers, noticing problems and forming theories as to usage (Johns, 1991). Perhaps of most importance to the theme and goal of Gavioli’s text, however, is the “idiom principle” (Sinclair, 1991; 1996). This principle posits that there are numerous and ubiquitous regularities within language that are indicative of certain registers. These regularities cannot be explained by grammar or lexical-logical systems alone. They can, however, be revealed by corpus investigations. This idiom principle, more so than any other benefit of corpus work, forms the backbone of Gavioli’s argument for the necessity of presenting corpus analysis in the ESP classroom. While the remaining chapters see Gavioli develop many of the arguments listed above, the idiom principle remains a constant theme, underlining and emphasizing the marriage of corpus analysis and ESP classes.
Chapter 3 focuses on developing the theory and evidence for the Sinclair (1991) notion of the “idiom principle” and how, given such a foundation, students need to be trained to recognize and understand the often complicated output generated in corpus analysis. The idiom principle is a sociolinguistic/conventional explanation of language features that contrasts with the rationalistic principle associated with the work of Chomsky. Gavioli argues that proponents of the idiom principle supplied much evidence suggesting that word combinations (collocations) and lexico-grammatical combinations (colligates) occur as a matter of convention or fashion rather than as simple logical or grammatical inevitabilities. In this chapter, Gavioli highlights the evidence for the idiom principle and outlines the importance of teaching its conclusions to language students. Gavioli also makes clear that while concepts such as the idiom principle remain key for demonstrating the importance of corpus work, we cannot expect students to recognize that importance without training. As such, the chapter spends much time outlining how, where, and why students need to be guided on the interpretation of data generated from corpus investigations. Gavioli demonstrates that without this guidance students will tend to use corpus data poorly and, consequently, will neither benefit from nor enjoy the experience.
In Chapter 4, Gavioli focuses on the emergence of specialized corpora of registers and how they might be used in comparison to general corpora. Gavioli begins by outlining the 1990s debate on corpus size, where ‘large corpora’ were often viewed as not large enough, and ‘quite small corpora’ were equally often viewed as perfectly sufficient. Gavioli points out that ESP is often served better by the smaller, more specific, “specialized” variety of corpora. These smaller corpora focus on instances of language indicative of a particular register. As a consequence, results of such analyses tend to be more reliable inasmuch as the idiosyncrasies of the register are not drowned by features that are common in more general corpora.
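The point about register features being “drowned” can be illustrated with a toy calculation. The sketch below is this review's own illustration, not an exercise from Gavioli's book: the two miniature “corpora” are invented sentences, and the function name `rate_per_thousand` is hypothetical. It simply compares how often a word occurs per 1,000 tokens in a specialized sample alone and in the same sample pooled with general text.

```python
import re

def rate_per_thousand(text, word):
    """Occurrences of `word` per 1,000 tokens of `text` (case-insensitive)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return 1000 * tokens.count(word.lower()) / len(tokens)

# Invented toy samples (assumptions for illustration only, not real corpus data).
specialized = "the patient presented with acute pain the patient was discharged"
general = "the weather was fine and the old dog waited by the door with a ball"

# Rate in the specialized sample alone versus the pooled sample.
alone = rate_per_thousand(specialized, "patient")
pooled = rate_per_thousand(specialized + " " + general, "patient")

print(f"specialized only: {alone:.1f} per 1,000 tokens")
print(f"pooled with general text: {pooled:.1f} per 1,000 tokens")
```

Even in so small a sample, the register-specific word becomes markedly less prominent once general text is added, which is the dilution that a small specialized corpus avoids.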
In Chapter 5, Gavioli offers more advice and guidance for introducing students to interpreting corpus analyses. Gavioli begins by pointing out that students will probably be unfamiliar with tools such as concordancers and the output that they generate. Students may also not appreciate that a concordancer produces results based on samples and that these samples cannot be relied on to be “the truth.” Students also need to understand that a concordancer does not explain results, as a dictionary, a book, or a teacher might. As such, students need to understand that they have a responsibility in unearthing meaning and usage from the data. A final point that Gavioli raises is one that might easily have been overlooked: students need to understand that results generated from corpora are representative of a certain register, and that whatever is concluded from those results may not necessarily be true of the language in general. Once again then, the marriage of corpus analysis and ESP classes is emphasized.
Gavioli goes on to explain that interpreting concordance data is an inductive activity. Such activities, she argues, are not untypical of the classroom where a teacher often presents examples and the students learn to generalize rules from them. That having been acknowledged, however, Gavioli also makes clear that there is a difference between “samples” and “examples.” The blackboard is generally the place of a limited number of good, clear examples. A concordance, on the other hand, may be a list of hundreds of cases, few of which are particularly clear examples of a meaning or usage. Sheer volume of raw ‘samples’ is not good enough to help students determine the meaning or usage of a phrase or word.
To help begin working with corpus data, Gavioli recommends simply starting by telling students what a concordance is. This is followed by gradually introducing how a concordance is different from blackboard examples. Much of the remainder of the chapter shows how this might be approached. The examples and tasks provided are clear and broad, and should provide an adequate base for many teachers to adapt the approach to the needs of their own classrooms.
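For readers who have not met a concordance before, a minimal keyword-in-context (KWIC) sketch may help. This is an illustration of the general idea only, not the software or the tasks Gavioli describes: the function name `kwic`, the `width` parameter, and the sample sentences are all invented for this purpose.

```python
import re

def kwic(text, keyword, width=30):
    """Return one keyword-in-context line per occurrence of `keyword`."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        # Take up to `width` characters on either side of the hit.
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Pad both sides so the keyword lines up in a central column.
        lines.append(f"{left:>{width}} [{match.group()}] {right:<{width}}")
    return lines

# Invented sample text (an assumption for illustration, not corpus data).
sample = (
    "The patient was admitted with chest pain. "
    "The patient reported no history of asthma. "
    "Each patient signed a consent form."
)

for line in kwic(sample, "patient"):
    print(line)
```

The output is a column of raw hits with the search word aligned in the center. Nothing in it explains meaning or usage, which is why the distinction between “samples” and blackboard “examples” matters for students.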
In Chapter 6, Gavioli discusses how the student can become part of the relevant discourse community. Gavioli argues that corpora of specialized language offer students a collection of language samples that are indicative of the discourse community, its beliefs, and its conventions. Corpus tools allow students to filter this information, highlighting the aspects of language that are particularly relevant to the student. Chapter 6 differs from the previous chapter in that the emphasis on investigation switches to the student interest rather than the teacher interest. As such, examples and approaches are presented for a more autonomous student.
Chapter 7 summarizes the goals and findings of the preceding chapters. Gavioli also takes this opportunity to pose a number of questions for future research to consider.
EVALUATION
While Gavioli presents a succinct, informative, and much needed text that any teacher of ESP would certainly benefit from, the limited size and scope of the book (and a not inconsiderable price tag) leave a number of areas open to criticism. First, Gavioli maintains that her book is more about 'how we learn' than 'how we teach', that her presentations are only a guide for teachers, and that it is for individual teachers themselves to decide how such activities and resources might suit their own classes. Such admissions are honest, accurate, fair, and yet insufficient. The text certainly presents a number of examples of student reaction to class exercises, but, to this reader at least, it actually provides little on 'how we learn', and may be not so much a guide as a pointer. That is, I am perhaps less convinced than Gavioli that many ESP teachers can so easily turn her well-intended (though limited) examples into appropriate exercises.
A second criticism of Gavioli’s approach is a criticism common to most texts concerning corpus analysis. That is, why is there always the assumption that it must begin and end with a concordancer? Even if we accept completely Gavioli’s argument that corpus analysis allows us to go beyond lexico-syntactic language learning and into the idiosyncratic register-relevant language, then why can we not discuss the abundance of other kinds of tools that are available? For example, there is Coh-Metrix (Graesser, McNamara, Louwerse, & Cai, 2004), LIWC (Pennebaker & Francis, 1999), Landscape (Tzeng, van den Broek, Kendeou, & Lee, 2005), and many other (often far simpler) methods through which corpora can be studied and language registers can be appreciated. This is not to say that Gavioli should have reported on these tools as often as she did the concordancer; however, it is to say that corpus analysis for researchers, teachers, and students must consider the advantages and benefits of tools other than concordancers.
A final criticism concerns the text’s lack of quantitative evidence. Obviously, ESP does not have a strong tradition of statistical analysis, but as demonstrated elegantly in “Mind and Context in Adult Second Language Acquisition” (Sanz, 2005), it is more than possible to bring quantitative terminology and evidence to a text without overburdening even a complete novice. Texts on corpus analysis that ignore quantitative evidence are doomed to endless instances of phrases such as “it seems to me”: the prevalence of this phrase in Gavioli’s text became ever more frustrating with each encounter, suggesting that whatever the merits of the text (and there are many), the real evidence may be little more than Gavioli’s heartfelt opinion.
While these three criticisms detract from the value of Gavioli’s text, the book remains an excellent and concise analysis of the benefits of, and approaches to, corpus studies in ESP classrooms. Perhaps too brief to be a course book on its own, even for an undergraduate class, “Exploring Corpora for ESP Learning” is certainly an accessible, interesting, and insightful text that all teachers in any walk of corpus studies should consider.
REFERENCES
Carter, R. (1998). Orders of reality: CANCODE, communication and culture. ELT Journal, 52, 43-56.
Flowerdew, J. (1993). Concordancing as a tool in course design. System, 21, 231-243.
Graesser, A., McNamara, D. S., Louwerse, M., & Cai, Z. (2004). Coh-Metrix: Analysis of text on cohesion and language. Behavior Research Methods, Instruments, & Computers, 36, 193-202.
Johns, T. (1991). Should you be persuaded: Two examples of data-driven learning. In Johns & King (Eds.), 1-6.
Partington, A. (1998). Patterns and Meanings. Amsterdam: John Benjamins.
Pennebaker, J. W. & Francis, M. (1999). Linguistic Inquiry and Word Count: LIWC. Mahwah, NJ: Erlbaum.
Sanz, C. (Ed.). (2005). Mind and context in adult second language acquisition: methods, theory, and practice. Washington, D.C.: Georgetown University Press.
Sinclair, J. (1991). Corpus, Concordance, Collocation. Oxford: OUP.
Sinclair, J. (1996). The search for units of meaning. Textus, 9, 75-106.
Tzeng, Y., van den Broek, P., Kendeou, P., & Lee, C. (2005). The computational implementation of the Landscape Model: Modeling inferential processes and memory representations of text comprehension. Behavioral Research Methods, Instruments & Computers, 37, 277-286.
Widdowson, H.G. (1998). Context, community and authentic language. TESOL Quarterly, 32, 705-716.
Zorzi, D. (2001). The pedagogic use of spoken corpora: Learning discourse markers in Italian. In Aston (Ed.), 85-107.
ABOUT THE REVIEWER:
Philip McCarthy is a research scientist at the Institute for Intelligent
Systems, Department of Psychology, at The University of Memphis. His main
research interests are developing and testing algorithms for textual
analysis. McCarthy also teaches a variety of linguistics courses and has
over 10 years' experience in teaching English as a foreign language.