Review: Applied Linguistics; Language Acquisition; Psycholinguistics; Sociolinguistics: Isaacs, Trofimovich (2017)

Date: 18-Jul-2017
From: Joan Bajorek
Subject: Second Language Pronunciation Assessment
EDITOR: Talia Isaacs
EDITOR: Pavel Trofimovich
TITLE: Second Language Pronunciation Assessment
SUBTITLE: Interdisciplinary Perspectives
SERIES TITLE: Second Language Acquisition
PUBLISHER: Multilingual Matters
YEAR: 2017

REVIEWER: Joan Palmiter Bajorek, University of Arizona

REVIEWS EDITOR: Helen Aristar-Dry


Second Language Pronunciation Assessment, by Isaacs and Trofimovich, is a compilation of 14 chapters dedicated to second language (L2) pronunciation assessment for researchers and educators that synthesizes knowledge from over five decades of research in this field (Isaacs & Trofimovich, 2017; Lado, 1961). Uniting a breadth of interdisciplinary work from the perspectives of Second Language Acquisition (SLA), speech and hearing sciences, psycholinguistics, English and French lingua franca communication studies, and sociolinguistics, the editors set forth goals to consolidate knowledge, modernize the field, define baseline standards, foster scholarly dialogue, and chart the course for future research.

The editors, Isaacs and Trofimovich, both well-established scholars in the SLA pronunciation field, spearhead this ambitious project. The 23 other contributors to the volume comprise researchers from Canada, Europe, Hong Kong, and the United States. Research outlined in this volume builds upon qualitative, quantitative, and mixed-methods research. The intended audience includes academics, researchers, teachers, and language professionals. The chapters are organized into five main sections and “can be read in sequence or as stand-alone units” (Isaacs & Trofimovich, 2016c, p. 7).

Part 1: Introduction

This section gives a broad overview of the current state of L2 pronunciation research, defining and operationalizing how pronunciation is assessed via machine and human measures.

In Chapter 1, “Key Themes, Constructs and Interdisciplinary Perspectives in Second Language Pronunciation Assessment,” the editors outline the scope of the book, the authors’ goals, the book’s intended audience, the organizational scheme of the book, and the definition of its core terminology. Thus, this chapter provides a backbone and framework for the volume.

Clarifying the terms used in the book’s title and throughout the work, the authors define “second language” in the “broadest possible sense” involving situations where the language is non-native (Isaacs & Trofimovich, 2016c, p. ii). “Pronunciation” is defined as individual consonant and vowel segments and larger units that encompass “word stress, rhythm and intonation” (Isaacs & Trofimovich, 2016a, p. 9). “Assessment” refers to elicitation of linguistic performance as delineated by scores and numerical measures.

In Chapter 2, “What Do Raters Need in a Pronunciation Scale? The User’s View,” Harding uses data from a focus group to consider the usability of the “Phonological control scale” of the Common European Framework of Reference for Languages (CEFRL) from the perspective of raters using the scales (Harding, 2016, p. 15). The goal of this study was to consider the problematic aspects of the scales and set forth generalizable principles for future pronunciation scales.

Building on a previous analysis of the International English Language Testing System (IELTS) (Yates, Zielinski, & Pryor, 2011), Harding elicited data from “nine experienced raters” and identified four major themes that influenced scale usability (Harding, 2016, p. 18). These themes included clarity/coherence, conciseness, intuitiveness, and theoretical currency. Specifically, terms such as “natural” and “noticeable” were deemed ambiguous and potentially anachronistic in nature. Harding concludes with technical recommendations for improving the CEFRL standards and future assessment tools (Harding, 2016).

Part 2: Insights from Assessing Other Language Skills and Components

This section of the book draws upon insights from other areas of L2 assessment, including fluency, writing, and listening, whose data can help to inform future research.

In Chapter 3, “Pronunciation and Intelligibility in Assessing Spoken Fluency,” Browne and Fulcher investigated fluency and the interaction of 87 raters’ familiarity with the L2 accent and rating of L1 Japanese learners of English. The authors problematized the concepts of “fluency” and “intelligibility” as subjective constructs relating to the idea of “nativeness,” all three of which are terms without agreed-upon definitions in the field.

When the 87 raters of varied home countries and familiarity with the Japanese language rated five L2 English learners, scores for pronunciation and intelligibility were correlated with listener familiarity. Raters who had more experience with L1 Japanese, L2 English learners were more severe in their grading style than those with less experience with this demographic. Since “familiarity,” rather than student production, was a key factor in rating, future rating of learner pronunciation might benefit from objective, computer-automated speech rating rubrics.

In Chapter 4, “What Can Pronunciation Researchers Learn From Research into Second Language Writing?” Knoch discusses how advances in L2 writing assessment can aid pronunciation research. The chapter covers “(a) rating scale development and validation; (b) rater effects and rater training; (c) task effects; and (d) issues in classroom-based assessment” to provide background for and inform future research in this field (Knoch, 2016, p. 54).

Work on L2 writing can be translated into L2 pronunciation assessment contexts. These aspects include the development of rubrics and rating scales, “holistic and analytical scales,” “corpus linguistic techniques,” mixed-methods approaches to validation of scoring, and peer feedback (Knoch, 2016, pp. 55, 57). The author calls for an expansion of approaches to L2 pronunciation assessment that incorporates varied techniques and considers the roles and backgrounds of the rater, speaker, and peer interlocutors in classroom contexts.

In Chapter 5, “The Role of Pronunciation in the Assessment of Second Language Listening Ability,” Wagner and Toth explored the features of pronunciation that influence scripted and unscripted L2 listening assessment of L2 Spanish learners in survey data. The authors consider characteristics of authentic and inauthentic spoken production including connected speech, planned speech, “spoken grammar,” vocabulary from oral genres, “rate of speech,” and aspects of hesitation (Wagner & Toth, 2016, p. 76).

In this study, L2 listeners were able to differentiate between authentic, unscripted audio and inauthentic, revised audio. The authors found that authentic listening materials were more difficult for L2 listeners’ comprehension due to the unplanned nature of the speech that included “elision, intrusion, assimilation, juncture,” and other spontaneous speech features (Wagner & Toth, 2016, p. 76). If the pronunciation goals of L2 students include intelligibility, then spontaneous, unscripted, authentic L2 listening materials should be prioritized in L2 classroom and test-taking environments.

Part 3: Perspectives on Pronunciation Assessment from Psycholinguistics and Speech Sciences

This section explores the psycholinguistic and speech sciences aspects of pronunciation, problematizing the “objective and subjective” empirical measures by which pronunciation is measured (Isaacs & Trofimovich, 2016a, p. 7).

In Chapter 6, “The Relationship Between Cognitive Control and Pronunciation in a Second Language,” Mora and Darcy investigated “the relationship between cognitive control and L2 pronunciation accuracy” (Mora & Darcy, 2016, p. 112). In this study of 49 L2 English learners, the authors explored “attention control, phonological short-term memory,” and inhibitory language control as related to assessed L2 pronunciation (Mora & Darcy, 2016, p. 98).

Participants from Seville and Barcelona, Spain were divided into four groups according to their language backgrounds: monolingual, bilingual, balanced bilingual, and unbalanced bilingual. The study included a production task, perception task, attention control task, inhibition task, and vocabulary task. Building from psycholinguistic studies of multilinguals (Bialystok, 2011), data from this study suggest a positive correlation between working memory capacity and target-like L2 production. The authors state, “better phonological memory capacity and stronger attention control produced more target-like” vowel durations (Mora & Darcy, 2016, p. 112). This research is correlational, not explanatory in nature, but has implications for future research in this vein.

In Chapter 7, “Students’ Attitudes Towards English Teachers’ Accents: The Interplay of Accent Familiarity, Comprehensibility, Intelligibility, Perceived Native Speaker Status, and Acceptability as a Teacher,” Ballard and Winke approach L2 pronunciation from a sociolinguistic framework to investigate students’ perceptions, desires, and motivations related to the accents of L2 teachers. Students are key stakeholders in language education and thus their perceptions significantly impact the language field. Accent bias and the “native speaker fallacy” affect perception about the quality of instructors and their employment (Ballard & Winke, 2016, p. 134).

The authors collected data from a survey with audio clips and Likert scale judgments of 157 participants. Data indicate that non-native speakers (NNSs) struggled to identify the native/non-native status of speakers (68% correct) and accent of origin (26% correct) (Ballard & Winke, 2016, p. 130). The authors found a positive correlation between perceived teacher acceptability and accentedness, intelligibility, and comprehensibility ratings. This study bolstered previous work on accent stereotyping and listener perceptions (Arteaga, 2000; Derwing & Munro, 1997; Gluszek & Dovidio, 2010; Kang, 2012; Markley, 2000; Scales, Wennerstrom, Richard, & Wu, 2006), in which a non-native accent significantly and negatively impacts other attitudes of students, whether consciously or implicitly. Findings suggest a need for longitudinal perception data in L2 classrooms and ratings from listeners of different L1 backgrounds.

In Chapter 8, “Re-examining Phonological and Lexical Correlates of Second Language Comprehensibility: The Role of Rater Experience,” Saito, Trofimovich, Isaacs, and Webb discuss the connection between rater experience, comprehensibility ratings, and aspects of L2 speech. The study considered “expert” and 5 “novice” raters as they rated 40 short audio recordings and transcribed production of L1 French, L2 English speakers (Saito, Trofimovich, Isaacs, & Webb, 2016, pp. 144-145).

Echoing the findings of Ballard and Winke’s work on rater “familiarity” (2016, p. 121), the data from this study suggest that “expert” raters with explicit coursework and training in L2 speech assessment were more lenient in their rating of comprehensibility than their “novice” counterparts. By “evaluating both audio samples and transcriptions of speech,” the authors found evidence that raters prioritized word stress over segmental accuracy and several lexical variables, including vocabulary diversity, polysemy, lemma appropriateness, and morphological accuracy (Saito et al., 2016, p. 150). This “difference in rater behavior” in rating the same student production supports the idea that rater background should be considered for consistency in L2 assessment.

In Chapter 9, “Assessing Second Language Pronunciation: Distinguishing Features of Rhythm in Learner Speech at Different Proficiency Levels,” Galaczi, Post, Li, Barker, and Schmidt explored rhythm and L2 speech as a function of learner proficiency levels. Analyzing the audio files of 20 L2 English learners, this study considered five rhythmic properties of the learner speech including vowel reduction, syllable structure complexity, durational marking of accentuation, and final lengthening of syllables (Galaczi, Post, Li, Barker, & Schmidt, 2016, p. 160).

Results from the study suggest a positive correlation between oral production features and higher CEFR proficiency levels of the students, namely faster speech rate and appropriate durational patterns for syllables at different positions. Learners with higher proficiency had more fluency in their pronunciation than learners with lower proficiency. Rater L1 backgrounds (German, Spanish, and Korean) were a significant factor in comprehensibility judgments of the L2 English speech, echoing the findings of Browne and Fulcher in Chapter 3. Thus, it is recommended that learner speech rate and durational patterns, as well as rater background, be controlled for in further research.

Part 4: Sociolinguistic, Cross-cultural and Lingua Franca Perspectives in Pronunciation Assessment

This section explores the contemporary sociolinguistic, multilingual, and cross-cultural implications of pronunciation teaching and assessment.

In Chapter 10, “Commentary on the Native Speaker Status in Pronunciation Research,” Davies debates the term “native speaker” and why it is “both contentious and necessary” (2016, p. 185). In this brief chapter, Davies discusses how the term “native speaker” in no way encompasses the range of high-level speakers, i.e. early childhood exposure by birth, native user of a language, exceptional learners, and long residency in an adopted language society and culture.

According to the author, the term “native speaker” is “more political than linguistic” and from a “postcolonial, even racist” framework (Davies, 2016, p. 186), that is “pointless” (Chomsky, 1986, p. 57) and merely a “mystique” (Ferguson, 1983, p. vii). Problematized by prestige accents, “markers of status and class,” and a lack of universally accepted standards for most languages, Davies claims that “native speaker” is fundamentally related to sociolinguistics and “national and ethnic identity” (2016, pp. 187, 189; Eades, 2005). Davies concludes by supporting Moyer’s call for the disambiguation of the terms “accent,” “intelligibility,” and “proficiency” as they exist today in modern society as frequent tools of discrimination, racism, and classism (Davies, 2016, p. 190; Moyer, 2013, p. 172).

In Chapter 11, “Variation or ‘Error’? Perception of Pronunciation Variation and Implications for Assessment,” Lindemann investigates linguistic variation through a discussion of the term “standard” as applied to English and the perception of variation in native and nonnative speech. A large amount of variation naturally occurs in pronunciation as influenced by “social class, ethnicity, gender, age,” “region of origin,” “interlocutors,” and “status of the statement,” among others (Lindemann, 2016, p. 194).

Lindemann argues that accent perception is a conduit for stereotyping, discrimination, and racism against nonnative speakers and less prestigious dialectal varieties. Even in circumstances where speakers are intelligible and miscommunications are infrequent, speakers are “heard differently” depending on the perceived identity of the speaker, native or nonnative (Lindemann, 2016, p. 200). This bias is “especially against that spoken by non-White speakers” and reveals underlying prejudice of listeners (Lindemann, 2016, p. 201). Additionally, “social attractiveness qualities,” ratings of competence, educational background, friendliness, integrity, intelligence, and pleasantness are all influenced by the hypothetical identity of the speaker. The author makes recommendations for screening examiners of high-stakes tests of intelligibility.

In Chapter 12, “Notes on Teacher-Raters’ Assessment of French Lingua Franca,” Kennedy, Blanchet, and Guénette conducted a case study of pronunciation assessment through qualitative measures of experienced L2 French teacher-raters. The authors highlight the importance of French worldwide as a Lingua Franca, noting the over 40 countries where it is habitually spoken. It is projected that the population of French-speaking Africa “will double by 2060,” augmenting the importance of French language research.

The four constructs of accentedness, comprehensibility, fluidity, and communicative effectiveness were studied with respect to qualitative data from four teacher-raters. Data collected indicate that teacher-raters made explicit distinctions between accentedness and comprehensibility measures. Despite their extensive training in phonetics and pronunciation assessment, there was subjective bias in rating norms. The authors noted the paucity of research on L2 pronunciation assessment with regard to raters and called for more studies of this nature, especially with respect to the French language.

In Chapter 13, “Pronunciation Assessment in Asia’s World City: Implications of a Lingua Franca Approach in Hong Kong,” Sewell calls for a “lingua franca approach” to English pronunciation in which intelligibility is the central criterion (2016, p. 237). The author advocates a feature-based approach that evaluates pronunciation based on priorities of functional loads combined with “pragmatic awareness of local and global influences” (Sewell, 2016, p. 252).

In a global society where English is a Lingua Franca, World Englishes and associated pronunciation variation are subject to the “effects of superdiversity” (Sewell, 2016, p. 237). In international contexts, such as Hong Kong, where English pronunciation varies highly, the author considers what features are important for teaching and assessment. While some features are less essential for intelligibility, they still hold significant sociolinguistic weight, i.e. interdental fricatives, vowel length, and nuclear stress. The author concludes that in the context of pronunciation, acceptance of variation and diversity, as well as “adaptability and flexibility,” “are paramount” to listener and speaker accommodation (Sewell, 2016, p. 252).

Part 5: Concluding Remarks

In Chapter 14, “Second Language Pronunciation Assessment: A Look at the Present and the Future,” Trofimovich and Isaacs provide a general overview of all of the chapters in the volume, summarizing their most important insights and contributions to the field. The authors note that current trends in the L2 pronunciation assessment field include greater focus on intelligibility, specified descriptors of linguistic features that influence comprehensibility, emphasis on pronunciation standards and raters, the importance of lingua franca pronunciation in today’s global world, the sociolinguistic implications of the terms “native” and “nonnative,” the racism and discrimination intertwined in pronunciation biases, and the consideration of individual differences.

The authors also note that the L2 pronunciation assessment field can benefit from the advances in assessment in L2 writing and L2 listening, specifically including “self-assessment,” “paired assessment,” and authentic versus modified pedagogical materials (Isaacs & Trofimovich, 2016b, p. 264). Variability of L2 pronunciation performance can also be considered from a psycholinguistic viewpoint where vast individual differences in cognitive variables, such as attention control, short-term memory, and inhibitory control, provide evidence against a “one-size-fits-all” model for pronunciation teaching and assessment (Isaacs & Trofimovich, 2016b, p. 265).

The summary chapter concludes with a series of questions which delve into the nature of stakeholders, the role of technology, screening for rater bias, which features to prioritize, ecological validity, authentic materials, the theorization of language learning, form-focused instruction, the integration of pronunciation into other pedagogical material and research, and literacy practices among other issues.


Valuable Resource

This volume is invaluable to the L2 pronunciation assessment field and its future directions. Organizing and synthesizing knowledge from over five decades of research from several perspectives, the content is useful across a wide variety of contexts. The introduction is strong, vibrant, and linear, giving a broad overview of the topics to be covered that helps orient readers toward the chapters most pertinent to them, whether for research, assessment purposes, or graduate seminar settings.

Consensus and Scope

One of the greatest strengths of this volume is the diversity of topics covered, spanning SLA, speech sciences, and psycholinguistic and sociolinguistic viewpoints on L2 pronunciation assessment. Most of the authors’ main arguments center around five main topics: A Call for Better Rubrics, Explicit Rater Background Information, Broader Understandings of the Term “Native” Speaker, Focus on Intelligibility, and Theoretical Frameworks. Most contributors to this volume agreed that assessment rubrics needed to be more specific and complete in their descriptions. They called for greater consideration of rater background and explicit training practices to minimize bias and improve inter-rater reliability. There were also calls for broader definitions of the term “native speaker,” a greater focus of the field on intelligibility over native-speaker constructs, and the need for theoretical frameworks in the field. The resounding consensus was a need for greater focus on L2 pronunciation assessment and for that emphasis to ripple throughout teaching material, assessment practices, and intelligibility of pronunciation as the goals of students and instructors.

Editing Issues: Organization and Clarity

This volume would have benefitted from stronger editing for structure, organization, and clarity of concepts. The editors write that chapters “can be read in sequence or as stand-alone units” (Isaacs & Trofimovich, 2016c, p. 7). However, titles for the chapters were often vague and too broad. For example, scholars researching the CEFR scales would not know that Harding in Chapter 2 would even discuss them because neither the title nor the introductory paragraph mentions them, instead choosing the phrasing “a particular pronunciation scale” (Harding, 2016, p. 12). Generalizability in research is helpful; obfuscation is not. This style and organization of writing can hinder potential readers from finding and interpreting important content efficiently. Abstracts or short summaries at the beginning of each chapter would have been helpful.

Almost Exclusively about English

One of the shortcomings of this volume is the heavy emphasis on English. This lack of linguistic variety is fully acknowledged by the editors who write, “research into the assessment of pronunciation in languages other than English is virtually non-existent” (Isaacs & Trofimovich, 2016b, p. 266). The notable exception is the treatment of French in this volume (Kennedy, Blanchet, & Guénette, 2016). However, even this inclusion of French in North America maintains a white, Western, academic-context norm. Research in multilingual and diverse contexts should be better covered in future work.

Scant Coverage of Technology

An additional gap was the lack of coverage of L2 assessment technology in this volume. If the editors indeed want to “include state-of-the-art” material of this field, the paucity of coverage of this topic is problematic (Isaacs & Trofimovich, 2016c, p. 6). If the editors note that there is currently a “rigorous field-wide debate on machine-mediated automated scoring in the first decade of the 21st century,” then at least one chapter out of 14 should have discussed it at length (Isaacs & Trofimovich, 2016a, p. 5). Automatic Speech Recognition (ASR) is already prevalent in the field for TOEFL tests and Pearson’s Versant tests (Isaacs & Trofimovich, 2016b). A wide variety of research is being conducted on ASR in pedagogical contexts (Hincks, 2003; Kim, 2006; Liakin, Cardoso, & Liakina, 2015; Wang & Young, 2015). One chapter of this volume noted that rampant human rater bias and variation in cognitive abilities have direct implications for objective assessment measures such as ASR software to “play a central role” in future assessment (Galaczi et al., 2016, p. 179). However, this was only a brief passage in the chapter’s discussion section. The lack of content about L2 pronunciation assessment technology will be a disappointment to some readers and, for others, a motivator for future research in this field.

Overall, this volume is an invaluable resource to those in the L2 pronunciation assessment field and those who are new to the topic. In these chapters, decades of research about the topic from varying perspectives are analyzed and organized to outline future research agendas. Many researchers will rely on this volume as the field continues to blossom in the 21st century.


A Portland native, Bajorek is a doctoral student in the Second Language Acquisition and Teaching Program at the University of Arizona (UA). A Finalist in the UA Grad Slam Competition in 2017, her research explores data-driven language education and pronunciation development, in the vein of Rosetta Stone and Duolingo. Bajorek earned an MA in Linguistics in 2016 from the University of California, Davis, where she worked as an Associate Instructor of French. She is the lead curriculum designer for a language education juvenile prison project in Tucson. After her PhD, Bajorek plans to work in language technology research and development.

