Review of Assessing Foreign Language Students' Spoken Proficiency

Reviewer: Hye Jin Yang
Book Title: Assessing Foreign Language Students' Spoken Proficiency
Book Author: Martin East
Publisher: Springer Nature
Linguistic Field(s): Applied Linguistics
Issue Number: 27.4156

Reviews Editor: Robert A. Cote


This book, “Assessing Foreign Language Students’ Spoken Proficiency: Stakeholder Perspectives on Assessment Innovation” by Martin East, reports on assessment innovation in the evaluation of senior high school students’ spoken proficiency in foreign languages (FL) in New Zealand. A new high-stakes assessment system, the National Certificate of Educational Achievement (NCEA), was launched in 2002. Subsequently, NCEA led a reform of foreign language assessments that placed greater emphasis on the learning potential of peer-to-peer interaction in tests. The book reports how a new foreign language assessment, called ‘interact’, was developed as a result of nationwide curriculum reforms. As part of a validation process, the book also examines the usefulness of the new test by uncovering stakeholders’ perspectives on the new test (‘interact’) compared with the previous test (‘converse’), based on Stage I and Stage II of a two-year research project.

Chapter 1, “Mediating Assessment Innovation: Why Stakeholder Perspectives Matter,” provides an overview of the entire book. It begins with the rationale for communicative language teaching (CLT), which serves as the theoretical background for curriculum and foreign language assessment reforms in New Zealand. It then explains the limitations of conventional score-based evidence for a validation argument. To determine the usefulness of a test, the author highlights the need to draw on stakeholder perspectives (those of teachers and students) on ‘interact’, which can be used to inform its validity argument. Details about the stakeholders’ views are addressed in the following chapters.

Chapter 2, entitled “Assessing Spoken Proficiency: What Are the Issues?”, acquaints the reader with the theoretical foundation for the development of ‘interact’ by addressing several issues that should be considered when assessing foreign language students’ spoken communicative proficiency. It discusses what it means to speak proficiently and how to define a construct in speaking tests. The author goes on to explain different paradigms of assessment, such as static versus dynamic and summative versus formative. In addition, the author considers whether the test outcome is task-based or construct-based. The chapter concludes with a discussion of different test formats (single or paired/group performances) for measuring speaking proficiency adequately.

Chapter 3, “Introducing a New Assessment of Spoken Proficiency: Interact,” introduces New Zealand’s revised curriculum and assessment reforms and shows how the reforms significantly influenced innovation in foreign language assessments. Addressing the need for a learning assessment and for new assessment criteria aligned with the new curriculum, this chapter presents detailed procedures and extensive information about the assessment reforms and ‘interact’. In contrast to the previous test (‘converse’), ‘interact’ is intended to elicit more spontaneous, unrehearsed, peer-to-peer interactions during the test. The latter part of the chapter presents a revised assessment matrix and the crucial changes between ‘converse’ and ‘interact’, focusing on the main goals of ‘interact’ in line with curriculum expectations.

Chapter 4, “Investigating Stakeholder Perspectives on Interact,” delves into the methodology of the two-stage empirical study on stakeholders’ (teachers’ and students’) perspectives on ‘interact’ in comparison with ‘converse’. The main instruments for this study, surveys and interviews, were administered to explore stakeholders’ opinions about the usefulness of ‘interact’, which were evaluated against the six qualities of the test usefulness framework (Bachman & Palmer, 1996): construct validity, reliability, interactiveness, impact, practicality, and authenticity. This chapter elaborates on the content of the two main instruments, the procedures for their implementation, and the analyses of the data collected during the initial phases of the operationalization of ‘interact’ (2012-2013).

Chapter 5, entitled “The Advantages of Interact”, and Chapter 6, “The Disadvantages of Interact and Suggested Improvements,” provide findings from Stage I of the study, which explored foreign language teachers’ responses to ‘interact’ collected through a nationwide survey and interviews with the participants. Chapter 5 focuses on the advantages of ‘interact’ in comparison with ‘converse’. Based on the teacher survey (n=152) and interviews with teachers who used ‘interact’ (n=14), the findings revealed several advantages of the new test. In general, teachers commented that ‘interact’ generated more natural, spontaneous, authentic interactions across a variety of topics instead of emphasizing linguistic accuracy, which consequently contributed to enhanced validity and positive washback. In contrast, Chapter 6 presents the disadvantages of ‘interact’ identified from the teacher survey and interview data, along with the teachers’ suggestions for improvements. Above all, impracticality stood out as a clear limitation because ‘interact’ is time-consuming to administer and its requirements for gathering evidence are unrealistic. The other limitation was an increase in workload for both teachers and students. The negative impacts of ‘interact’ relate to the perceived unrealistic demands of the assessment (limitations on spontaneous and unrehearsed interactions when taking into account students’ proficiency levels) and the potential unfairness of interlocutor variables. In light of these challenges, suggested improvements included a decrease in the number of interactions required, provision for scaffolding and rehearsal during the test, and more examples and flexible options. Additionally, more explicit direction on, and a clearer definition of, spontaneous and unrehearsed speech were suggested.

Chapter 7, “Interact and Higher Proficiency Students: Addressing the Challenges,” and Chapter 8, “Interact and Higher Proficiency Students: Concluding Perspectives,” present findings from Stage II of the study, focusing in particular on three issues that emerged from Stage I: (1) the nature of the task, (2) issues around spontaneity, and (3) the place of accuracy (grammar). Data collected for this stage included interviews with teachers (n=13) using ‘interact’ at NCEA level 3 (the highest level) and surveys administered to Year 13 students taking either ‘interact’ at level 3 (n=119) or ‘converse’ at level 3 (n=30). Chapter 7 taps into the teachers’ reflections on ‘interact’ at the highest level in comparison with ‘converse’. In this chapter, the 13 teachers’ interview responses are reported, along with direct quotes. The main issue is that some ‘interact’ tasks fail to promote interaction among students because of their complexity. Teachers’ suggestions for improving the tasks are introduced, such as tasks relevant to current events or task types that allow for more spontaneous speech instead of relying on pre-learned materials or particular grammar learned in class.

Chapter 8 reports on the perceptions of both teachers and students towards ‘interact’ in comparison with ‘converse’. It begins with teachers’ opinions about the washback of ‘interact’. Quotes from teachers who used ‘interact’ in their classrooms revealed that ‘interact’ creates positive washback for the classroom environment since it fosters more spontaneous, unrehearsed interactions among students. The chapter then turns to students’ perceptions of ‘interact’ and ‘converse’ in light of challenges and relevant issues. A noticeable finding is that students did not perceive either assessment as better or worse than the other in terms of perceived usefulness or fitness for purpose. Students’ quotes relevant to both tests are included to represent their diverse opinions about their assessments.

Chapter 9, entitled “Coming to Terms with Assessment Innovation: Conclusions,” summarizes the key issues identified from the data for both stages of the study. The latter part of the chapter extends the findings to broader issues and contexts for speaking assessments. The chapter concludes with recommendations for practice, limitations of the study, and directions for future research.


Throughout its chapters, this book provides an excellent report on the ongoing process of a nationwide reform of curriculum and foreign language assessments. It attracts readers’ attention by beginning with issues in the previous curriculum and language assessment systems in New Zealand. The extensive description of the changes in the curriculum and their impact on the foreign language assessments is well presented, which helps readers identify the issues addressed in the new test. Practical problems and issues identified in the early chapters are thoroughly linked with the rationale for the foreign language assessment reforms in subsequent chapters. In addition, the author strengthens the case for the reforms and for the evaluation of a new test (‘interact’) by providing the theoretical background of speaking assessments, such as communicative language teaching (CLT) and the test usefulness framework (Bachman & Palmer, 1996).

The clear organization of the book is another strength, which helps readers easily follow the entire story of the assessment reforms and the empirical study conducted at the initial phase of ‘interact’ implementation. For example, the book begins with background information about New Zealand (Chapter 1) and the theoretical background of speaking assessments (Chapter 2), allowing readers to grasp the situation in New Zealand. After a detailed explanation of ‘interact’ (Chapter 3), the methodology of the two-year study is extensively explained (Chapter 4). The following four chapters, Chapter 5 through Chapter 8, report the results of the two-year study, which are clearly presented with figures and tables. Chapter 9 synthesizes the findings from the study, bringing everything together to reach a final conclusion. Furthermore, each chapter begins with the purpose of the chapter and a review of the preceding chapters, reminding readers of the key points the author intends to make before presenting the new topic. The conclusion section of each chapter successfully summarizes the main issues and findings of the study presented in the chapter.

This book also contributes to the field of language assessment as it provides a successful example of exploring qualitative evidence in a validation study. The author pinpoints the limitations of score-based evidence for validation in Chapter 1, and qualitative approaches to language test validation have made a significant impact in the field (Lazaraton, 2002). In this sense, the book is an informative resource for other researchers of language assessment because of its extensive, precise descriptions of the study. For example, in the methodology chapter (Chapter 4), the procedures for data collection and data analysis are laid out chronologically. Furthermore, stakeholders’ perspectives on ‘interact’ are comprehensively presented with a summary of findings and discussions. The author’s discussion of possible future research and directions is also of great value for other researchers in language assessment.

While this book effectively presents the development of a new test along with stakeholders’ perspectives, more discussion of washback linked to the validation framework could further strengthen the book. Although the author addresses this issue in Chapter 8 (Section 8.2 Working for Washback, p. 168), this section focuses on presenting several teachers’ responses to ‘interact’ and does not seem sufficient to synthesize the overall results into a conclusion about positive and negative washback for assessment and curriculum.

All in all, this book will be of great interest to education policymakers and to practitioners and researchers of language assessment. The account of assessment reforms led by curriculum innovation could provide useful guidance for education policymakers in other contexts. The descriptions of ‘interact’ and the subsequent validation research are also very practical and useful. Finally, foreign language teachers will benefit from the book as a means of enhancing their knowledge.


Hyejin Yang received her Ph.D. from Iowa State University, USA. Her research interests include language assessment, computer-assisted language learning (CALL), and L2 speaking and writing instruction and assessment. She taught writing and speaking classes for international undergraduate and graduate students for several years in the USA. She has presented her research at professional conferences such as MwALT, LTRC, and CALICO, and has published her work in System and CALICO Journal.