Date: Mon, 14 Nov 2005 09:52:46 -0800
From: Esmat Babaii
Subject: Washback in Language Testing: Research Contexts and Methods
EDITORS: Cheng, Liying; Watanabe, Yoshinori; Curtis, Andy
TITLE: Washback in Language Testing
SUBTITLE: Research Contexts and Methods
PUBLISHER: Lawrence Erlbaum Associates
YEAR: 2004
Esmat Babaii, Department of Foreign Languages, University for Teacher Education, Tehran, Iran
A decade after the serious scholarly appraisal of the notion of washback pioneered by Alderson and Wall (1993), it is timely to have a new publication which looks carefully at washback from both theoretical and empirical perspectives. Washback in Language Testing is a well-balanced and informative collection of articles representing different methodological frameworks and offering different positions and research findings about washback in language testing.
The book consists of two parts. Part One, which comprises three chapters, introduces the concept of washback in language testing and the theoretical arguments about it, and presents the methodological frameworks most commonly used to investigate this complex phenomenon. Part Two, which includes Chapters Four to Eleven, reports the results of several empirical studies on washback.
In the 'foreword' to this volume, Charles Alderson, while providing a useful brief background and explaining how he became involved in investigating washback, questions what he calls a "Messickian view" (p. xi, see also Messick, 1996) of engineering positive washback by test design. He then calls for a multi-dimensional treatment of the washback phenomenon which considers washback not as a direct effect of a test in itself, but as a result of the interaction of numerous factors existing in the educational system.
In Chapter One: 'Washback or backwash: A review of the impact of testing on teaching and learning', Cheng and Curtis offer a collection of different outlooks on washback and suggest that, instead of being overly concerned with the positive and/or negative direction of washback, it is more reasonable to consider the complexity and intensity of the phenomenon and to explore its intricate causes in a given educational community. The success or failure of assessment-driven reform, they add, is not guaranteed beforehand. Rather, it will mostly depend on the internal dynamics of the education system.
In Chapter Two: 'Methodology in washback studies', Watanabe proposes a qualitative approach to investigating washback, given its complex rather than monolithic nature. He suggests a conceptualization of washback that centers on five dimensions: specificity, intensity, length, intentionality, and value. He places particular emphasis on analyzing, on the one hand, the aspects of learning and teaching that are often influenced by the test and, on the other hand, the factors that mediate the process of washback, including test facets, personal factors, and micro-/macro-contextual factors. He then provides detailed guidelines on how qualitative research in this area can be conducted, from designing the research to selecting the participants, analyzing the data, and interpreting the results. This chapter, it seems to me, is one of the best contributions to this collection. The points it discusses can serve as a clear set of criteria against which one may judge the validity of empirical studies on washback, including those presented in the second part of the book.
Chapter Three, 'Washback and curriculum innovation' by Andrews, examines the assertions made about the nature of the relationship between curriculum innovation and washback. It reviews the available empirical evidence for and against this link. Far from being a simple matter of yes or no, he concludes, the influence of high-stakes tests on the curriculum should be investigated through a careful examination of the particulars of the educational community; the changes introduced are often found to differ in type, depth, and complexity.
In Chapter Four: 'The effects of assessment-driven reform on the teaching of writing in Washington State', Stecher, Chun, and Barron report a study conducted to document the changes at both school and classroom levels during the early years of the Washington Educational Reform, which, among other things, introduced the Washington Assessment of Student Learning (WASL) as a new assessment mechanism focusing on writing skills. Teachers and principals participating in the survey report changes in the allocation of time, the emphasis placed on different aspects of learning, the content and method of teaching, and students' learning activities. Replacing multiple-choice tests with more performance-based assessments appears to lead to an increase in the amount of writing students do in schools. The reliance on self-reports provided only by teachers and principals, and not by students, nevertheless leaves this research and its findings in a rather vulnerable position.
In Chapter Five, 'The IELTS impact study: Investigating washback on teaching materials', Saville and Hawkey mostly present a research-in-progress report and focus on the instrument development phase of a large-scale, multi-phase IELTS impact study. They provide a detailed description of the instruments and procedures employed, including a user survey and structured group interviews with students, teachers, and oral examiners, along with ratings of test practice books and materials. Early evidence, it seems, points to the authenticity of texts in test-related books as a beneficial effect of IELTS. In another study on the washback effect of IELTS, reported in Chapter Six, 'IELTS test preparation in New Zealand', Hayes and Read examine whether IELTS preparation courses in New Zealand show any evidence of washback related to this high-stakes test. Classroom observations, teacher interviews, teacher and student questionnaires, and pretests and posttests of students were the data collection tools used to investigate the nature of the two preparation courses. The findings indicate drastic differences in the classroom practices used to prepare students for IELTS, implying that different instructors follow different methodologies in dealing with the intended test tasks and materials.
In a study of the washback effect in the Australian Adult Migrant English Program, reported in Chapter Seven, 'Washback in classroom-based assessments', Burrows documents the consequences of introducing the Certificate in Spoken and Written English (CSWE), drawing on data collected through questionnaires, interviews, and classroom observations. The results of this study reveal that different teachers are affected differently by the new competency-based assessment system. In fact, Burrows categorizes the teachers' reactions to CSWE into four types: resister, adopter, partial adopter, and adaptor. In an interesting discussion, she criticizes the traditional view of washback, which considers it to be a single, uniform response to a given test. Instead, she proposes a new model for washback "which takes into account teachers' belief systems and consequent responses to change" (p.125). A fruitful analysis, however, may address the observed variations in terms of patterns of behavior rather than as a bewildering array of individual and idiosyncratic responses. It appears to me that the introduction of this model, which proposes seeking a pattern rather than getting lost in the diversity, marks this contribution as a turning point in the study of washback.
The results of research into the washback effect of the English component of the Japanese University Entrance Examination are presented in Chapter Eight, 'Teacher factors mediating washback'. In this study, Watanabe collected the data through classroom observations and interviews with teachers. He concludes that the negative picture of the effect of the Entrance Examination depicted by the Japanese mass media does not truly reflect what is happening in classrooms. The test, he adds, has both positive and negative washback effects. He also holds that the effect of the test is mediated by teachers' psychological factors and school cultures. Instead of a top-down approach to curriculum innovation, he calls for efforts directed towards some types of teacher training in order to introduce changes at the level of individual teachers. The author draws our attention to many interesting points. However, as Watanabe himself points out, "teachers were informed of the purpose of research" (p.133), and this can weaken the validity of his findings.
In another study, reported by Cheng in Chapter Nine, 'The washback effect of a public examination change on teachers' perceptions toward their classroom teaching', the washback effect of the new (1996) Hong Kong Certificate Examination in English (HKCEE) is investigated. HKCEE is designed to encourage more task-based teaching practices in Hong Kong. Analysis of teacher questionnaires and classroom observations reveals that teachers are reluctant to make fundamental changes in their daily practices, although their reactions to the test are positive. Based on these findings, Cheng concludes that changes in the educational system appear to be superficial rather than substantial and that a change in the examination alone is unlikely to fulfill the intended purposes of test designers and policy makers. These findings are further supported by Qi's study, reported in Chapter Ten, 'Has a high-stakes test produced the intended changes?' It examines the intended washback effect of the National Matriculation English Test (NMET), a substitute for the old university entrance English examination in China. Through in-depth interviews and follow-up contacts with the test constructors and teachers, Qi finds that the test mainly influences the content of teaching but not the teaching methodology employed by teachers. In fact, there appears to be only a partial match between the test constructors' intention to promote communicative use of language and the reported actual classroom practices, leading the author to the conclusion that a high-stakes test may not be a good lever for change.
In Chapter Eleven, 'The washback effect of an EFL national matriculation test to teaching and learning', Ferman investigates the effect of a high-stakes test introduced as a means of curriculum innovation in the Israeli educational system. Drawing on extensive data collected through multiple sources (structured questionnaires, structured interviews, open interviews, and document analysis) and from multiple participants (teachers, EFL inspectors, and students), she finds that there is a strong washback effect on the educational processes, products, and participants in Israeli high schools. The effects, however, are characterized as both positive (promotion of language skills, especially oral skills, and of different teaching/learning strategies) and negative (a higher level of anxiety and increased pressure to cover the materials). To me, Ferman's study is an impeccable piece of research, as it admirably meets the 'triangulation' criterion which guarantees the credibility of research (cf. Davies, 1995).
Although some of the studies reported in this collection suffer from certain methodological defects, I think the collection is an important contribution to our understanding of the concept of washback in language testing. The reader gathers useful knowledge about washback and, at the same time, comes to understand how contextual differences between educational systems can affect the nature of washback in practice. The research reports collected in the volume, in my opinion, should be read critically, as shortcomings in the design and data collection procedures of a few studies may limit the generalizability of their findings.
REFERENCES
Alderson, Charles & Wall, Diane (1993) Does washback exist? Applied Linguistics 14, 115-129.
Davies, Katharine (1995) Qualitative theory and methods in applied linguistics research. TESOL Quarterly 29, 427-453.
Messick, Samuel (1996) Validity and washback in language testing. Language Testing 13, 241-256.
ABOUT THE REVIEWER:
Esmat Babaii is an assistant professor in the Department of Foreign Languages at the University for Teacher Education, Tehran, Iran. She has taught graduate courses including language testing and research methods for several years. Her research interests include language testing, discourse analysis, EAP, and L2 research. She is currently the editor of the Asian EFL Journal and also a member of the System review panel.