LINGUIST List 21.3465

Mon Aug 30 2010

Review: Ling. Theories; Phonetics; Phonology: Boersma & Hamann (2009)

Editor for this issue: Joseph Salmons

        1.    Andrew Blyth, Phonology in Perception

Message 1: Phonology in Perception
Date: 30-Aug-2010
From: Andrew Blyth
Subject: Phonology in Perception

EDITORS: Boersma, Paul; Hamann, Silke. TITLE: Phonology in Perception SERIES TITLE: Phonology and Phonetics [PP] 15 PUBLISHER: Mouton de Gruyter YEAR: 2009

Andrew Blyth, Faculty of Arts and Design, TESOL, University of Canberra


This volume, a collection of nine papers that aim to contribute to ''the interaction of phonology and phonetics within linguistics'' (back cover), is the fifteenth in the Phonology & Phonetics series from Mouton de Gruyter (now de Gruyter Mouton). The publisher acknowledges a 'tumultuous' relationship between phonology and phonetics, and seeks to engage both fields through the academic dialogue afforded by this series. Though at first this particular volume appears to focus more on phonology, the theoretical assumptions contributors rely on are a pragmatic mix of knowledge from both phonetics and phonology. Reassuringly, some very recognisable names have contributed to the nine chapters, including Ellen Broselow, Paul Boersma, and James McClelland. For me as a language teacher, this book provides some comfort for what some of us have felt might be true: that phonological knowledge assists in perception (pp. 19, 60, 103; see Altenberg 2005, Celce-Murcia, Brinton and Goodwin 1996 p. 10), as opposed to the common assumption of many linguists that phonological behaviour is influenced by what is perceivable (back cover). The editors aim to demonstrate this concept with these nine papers. Many of the contributors lean heavily on Optimality Theory as the basis of their work, as well as on Boersma's BiPhon model described in chapter two.


''Introduction: Models of phonology in perception'', by Paul Boersma and Silke Hamann, opens the book. The editors first clarify the notation used for both phonetic and phonological representations. They then provide the reader with a brief historical account of the relationship between comprehension and production, noting that comprehension was formerly assumed to be the reverse of production. Amusingly, Boersma and Hamann highlight the past lack of interest in comprehension models by contrasting twenty-three past and present comprehension and production models, with the comprehension column for the first eight models filled in with a small question mark. The editors present Smolensky's (1996, cited on p. 8) bidirectional grammar model as the first to consider production and comprehension as separate from one another. Boersma and Hamann conclude with models in which the perception process is not the same as the phonetic articulation process (termed 'phonetic interpretation' and 'phonetic implementation'), reinforcing the notion that comprehension is not the reverse of production.

Chapter one, ''Why can Poles perceive Sprite but not Coca-Cola? A Natural Phonological account'' by Anna Balas, sets out to show that Optimality Theory (OT) cannot fully explain perception. It begins with an anecdote about how, given American English input of /spraɪt/, Poles will repeat back [sprajt]. Balas then demonstrates that Poles will replace American English diphthongs like those in Coca-Cola /koʊkəkoʊlə/ with other American diphthongs, rather than with Polish vowel-plus-glide sequences as would be expected. Balas tests how Natural Phonology (NP) and OT explain this vowel substitution problem. She first provides a phonetic description of the phenomenon and compares NP with OT, and then describes NP-based perception and underlying representations. Balas argues that NP can account for Polish listeners' perception of American English diphthongs, and demonstrates how OT cannot. In contrast to OT, which requires the listener to already have knowledge of the phonological structure of the language being uttered, NP can deal with ''uncategorised, auditory phonetic input'' (p. 46), and thus NP accounts for the Polish pronunciation of Sprite and Coca-Cola.

Chapter two, ''Cue constraints and their interactions in phonological perception and production'' by Paul Boersma, has the stated aim of demonstrating ''how one can formalise the phonology-phonetics interface'' within OT and Harmonic Grammar (p. 55). Boersma reasserts a tentative model that represents the ecology of phonology and phonetics within a five-layered system called the BiPhon Model (Apoussidou 2007, and Boersma 2007). The five levels are <morphemes> and |underlying form|, which relate to the lexicon; /surface form/ and [auditory form], which form the phonetics-phonology interface; and [articulatory form] (p. 56). Boersma says that phonologists previously attempted to fit phonetic detail within the phonological levels (underlying and surface forms), and suggests that this should not be the case. The five-layered system allows both phonological and phonetic theories to be represented without compromise, and bidirectionally. Boersma demonstrates the model using the perception of foreign-language words, including loanwords. His examples are the Japanese perception of Russian [tak] ('so'; perceived as 'taku') and of English 'drama' (perceived as 'dorama'). He details various alternative perceptions, and uses the BiPhon model to explain why these alternatives fail. In essence, Boersma demonstrates that the perception process is restricted by the hearer's phonological constraints in the same way as the production process is restricted by phonological constraints (p. 103).

In chapter three, ''The learner of a perception grammar as a source of sound change'', Silke Hamann argues that auditory cue mappings and phonological categories differ from generation to generation. Whereas Ohala holds that generational change is phonetic, Hamann argues that it involves both phonetic and phonological knowledge, and that perception is phonological, especially the perception 'of auditory cues and their mapping onto language-specific phonological categories' (p. 111). She argues that this can be mapped onto the BiPhon Model (see chapter 2). Hamann says that sound change in phonemes occurs because younger generations weight cues differently from previous generations, possibly because some cues are regarded as less reliable. This would imply that phonological categories are not universal, as some argue, but are 'emerging' (p. 137). Hamann also acknowledges that sound change does occur within an individual's lifetime.

Chapter four, ''The linguistic perception of SIMILAR L2 sounds'' by Paola Escudero, attempts to explain native and second language (L2) perception using the Linguistic Perception Model (LP) originally by Escudero (2005, cited on p. 155), which itself evolved from work by Boersma (1998, cited on p. 155), and Escudero & Boersma (2003, cited on p. 155). As in previous chapters, Escudero uses linguistic arguments to explain this theory. She compares Canadian English (CE) and Canadian French (CF) and demonstrates how similar but non-identical phonological categories can be acquired for L2 speech perception. Escudero argues and demonstrates three main points. Firstly, L1 listeners are optimal perceivers of their own language. Secondly, L2 learners initially impose their L1 perception categories on the L2. Thirdly, the L2 learner ''adjusts her L1 perception to become an optimal L2 listener'' (p. 184), a process she models with both CE and CF. She argues well that the L2LP model is the most comprehensive account of the acquisition of L2 perception.

Chapter five, ''Stress adaptation in loanword phonology: perception and learnability'' by Ellen Broselow, attempts to explain loanword adaptation into Huave (a language of Mexico) and Fijian (a Pacific island language). She uses two language pairs, Huave and Spanish, and Fijian and English, to demonstrate how the borrowing languages impose their grammar on the adaptation. The nature of the adaptations depends heavily on the placement of stress in the language of origin, and on how the borrowing language can accommodate the word within its perception grammar. For instance, the stressed final syllable of 'bazaar' is perceived in Fijian as a lengthened vowel, and is adapted in that way (p. 216). Broselow concludes that apparently unlearnable rankings are in fact a 'reflection of input frequency of the working of a perception grammar' (p. 228).

Chapter six, ''Perception of intonational contours on given and new referents: a completion study and an eye-movement experiment'' by Caroline Féry, Elsi Kaiser, Robin Hörnig, Thomas Weskott, and Reinhold Kliegl, describes two experiments conducted in German. The first was a sentence completion task testing new and given referents indicated by accenting. The results show a preference for completion with 'given-referents', that is, a preference for low tones. The second experiment was an eye-movement task using the same sample sentences as the first. It assumed that participants would reveal their anticipation of intonation through eye movements as they viewed a picture and heard intonation-laden sentences (p. 251). Particular intonational contours were expected to lead the listener to anticipate that an object appearing in the picture would be mentioned in the remainder of the sentence. Féry et al. assumed that there would be a fixation on the 'discourse-given referent' (p. 251) rather than on the new or unaccented objects, whereas a late fall would lead the listener to expect an accented discourse-new object. The results of the second experiment indicate that listeners were able to respond to intonational cues and to reflect their expectations of the remainder of the sentence in anticipatory eye movements across a picture. Féry et al. argue that this supports Boersma's model, which includes phonetic and underlying representations. The paper concludes by stating strongly that listeners of intonation languages, like German, can use intonation to predict upcoming sentential constituents.

Chapter seven, ''Lexical access, effective contrast, and patterns in the lexicon'' by Adam Ussishkin and Andrew Wedel, examines Catalan and Hebrew data to demonstrate that efficient lexical access relies on neighbourhood density, word frequency, and allomorphy (p. 287). They argue that words like 'orange' have few phonologically similar words, and are therefore recognised more quickly than words like 'cat', which is similar to 'mat', 'sat', and 'at'. In the same way, words that are used more frequently are often recognised more quickly than their less common phonological neighbours. Similarly, allomorphy helps to distinguish the target word from its near neighbours. The authors acknowledge that this is a relatively new direction of research, but one that warrants further enquiry.
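
The notion of neighbourhood density in this chapter can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: it uses the common definition of a neighbour as a word differing by one substituted, deleted, or added segment; it treats letters as stand-ins for phonemes; and the toy lexicon and function names are invented here, not taken from Ussishkin and Wedel.

```python
def is_neighbour(a: str, b: str) -> bool:
    """Return True if b differs from a by exactly one substitution,
    deletion, or addition of a segment (letters stand in for phonemes)."""
    if a == b or abs(len(a) - len(b)) > 1:
        return False
    if len(a) == len(b):
        # Same length: exactly one position may differ (substitution).
        return sum(x != y for x, y in zip(a, b)) == 1
    # Lengths differ by one: dropping one segment from the longer
    # word must yield the shorter word (deletion or addition).
    shorter, longer = sorted((a, b), key=len)
    return any(longer[:i] + longer[i + 1:] == shorter
               for i in range(len(longer)))


def neighbourhood(word: str, lexicon: list[str]) -> list[str]:
    """List the neighbours of `word` in `lexicon`, alphabetically."""
    return sorted(w for w in lexicon if is_neighbour(word, w))


# Hypothetical toy lexicon, for illustration only.
TOY_LEXICON = ["cat", "mat", "sat", "at", "cot", "dog", "orange", "lemon"]

print(neighbourhood("cat", TOY_LEXICON))     # dense neighbourhood
print(neighbourhood("orange", TOY_LEXICON))  # sparse neighbourhood
```

In this toy lexicon 'cat' has four neighbours and 'orange' has none, which is the kind of contrast in neighbourhood density the chapter relates to recognition speed. A realistic measure would operate on phonemic transcriptions and a full lexicon.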

In chapter eight, ''Phonology and perception: a cognitive scientist's perspective'', the psycholinguist James L. McClelland reviews and reflects on a number of questions explicitly or implicitly raised in the preceding chapters. His enlightening review helps the reader to see the cohesion of the collection, as McClelland attempts to position it within the wider context of speech perception. McClelland uses the list of questions to guide the reader through a number of pertinent points that are not being explored by other disciplines which also have an interest in speech perception. McClelland ends the chapter with a call for a continued effort to describe the structure of languages with interdisciplinary approaches.


As shown in the chapter descriptions above, this volume represents a variety of new thought on speech perception from a phonological perspective. The collection makes clear, and elegantly demonstrates, that the perception process is neither the same as nor simply the reverse of production, whilst also providing valuable new insights into speech perception. The BiPhon Model is the central element of this book, and forms the basis of some of the theoretical assumptions in some chapters. The BiPhon model includes phonetic principles within its design, and looks to cognitive science for some grounding.

This volume was released at a time when there is burgeoning interest in speech perception from a variety of disciplines. For instance, lexical segmentation as researched by McQueen, Cutler, Otake, and others at the Max Planck Institute for Psycholinguistics (see Otake 2006 for a review) is of great interest to some language teachers and some linguists. Contributors to this volume do refer to cognitive science researchers, especially McQueen & Cutler, though such citations are limited to a few paragraphs in a few chapters. However, the contributors appear not to have looked very widely at research from the Max Planck Institute. There is a near-plethora of articles produced by McQueen, Cutler and their colleagues, but this volume mostly refers only to McQueen and Cutler (1997). Perhaps consequently, some readers may find the contributors' attempts to link their results with the article by McQueen and Cutler (1997) a little tenuous (see chapters 2, 3, and 6). Though the authors attempt to find compatibility with cognitive science, it would be preferable to see specific research that establishes these links explicitly and much more firmly.

Further, on a number of occasions, some authors refer to unpublished or yet-to-be-published articles, which perhaps reflects the newness of the ideas being presented in this volume (see pp. 47, 113, 249, 261, 287). Some readers may also be interested to note that in the first chapter Balas shows that Natural Phonology, rather than OT, appears to account successfully for the treatment of English L2 by Polish L1 listeners, while other chapters use OT.

Despite these criticisms, this volume contains exciting and potentially valuable new contributions that attempt to expand our understanding of the role of phonology and phonetics in speech perception. This volume has much to contribute not just to linguistics but to psycholinguistics more generally, and so the concepts it contains should form the basis of many discussions in future speech perception studies.


Altenberg, E. (2005) The perception of word boundaries in a second language. Second Language Research. 21(4), pp. 325-358.

Celce-Murcia, M., Brinton, D., and Goodwin, J. (1996) Teaching Pronunciation: A Reference for Teachers of English to Speakers of Other Languages. New York, USA: Cambridge University Press.

Otake, T. (2006) Speech segmentation by Japanese listeners: its language-specificity and language universality. In M. Nakayama, and Y. Shirai (eds); P. Li (general editor). The Handbook of East Asian Psycholinguistics, Volume II: Japanese. New York, USA: Cambridge University Press.


Andrew Blyth is a doctoral student in the TESOL department at the University of Canberra, Australia. His main interests are teaching and researching listening and pronunciation for English language teaching. He currently teaches English as a foreign language at various universities in central Japan.

Page Updated: 30-Aug-2010