Review of The Role of Speech Perception in Phonology

Reviewer: Ewa Jacewicz
Book Title: The Role of Speech Perception in Phonology
Book Author: Elizabeth V. Hume, Keith Johnson
Publisher: Academic Press
Linguistic Field(s): Phonology
Book Announcement: 13.1332

Date: Fri, 10 May 2002 08:26:55 -0700
From: Ewa Jacewicz
Subject: Hume & Johnson, ed. (2001) Role of Speech Perception in Phonology

Hume, Elizabeth, and Keith Johnson, ed. (2001) The Role of Speech Perception in Phonology. Academic Press, xviii+282pp, hardback ISBN 0-12-361351-5

Ewa Jacewicz, Speech and Hearing Science, The Ohio State University

This volume is an edited collection of ten papers originally presented at the satellite meeting "The Role of Perceptual Phenomena in Phonology" held in conjunction with the XIVth International Congress of Phonetic Sciences in San Francisco, CA, in August 1999. Elizabeth Hume and Keith Johnson, the organizers of the meeting and the editors of the present volume, have made an important contribution to the growing interest of phonologists in exploring perceptual insights and integrating them into descriptions of the sound patterns of language. Recent developments in speech technology, speech perception research, and phonological theory open the way to integrating the perceptual component of spoken language into its formal model. In his foreword to the book, Bjorn Lindblom emphasizes that "this movement now seems to be gaining momentum on the international scene" (p. vii). Comparing the book to a meeting of the minds, Lindblom invites the audience to be open-minded about the explanatory scope of external vs. formal factors discussed throughout the chapters.

In Chapter one, "A model of the interplay of speech perception and phonology" by Hume and Johnson, three questions are considered. First, to what extent does speech perception influence phonological systems? Second, to what extent does phonological structure influence speech perception? Third, where do speech perception phenomena belong in relation to a formal description of the sound structure of language? Particular attention is devoted to the influence of speech perception on phonological systems, exploring the fact that contrasts of weak perceptibility tend to be avoided in language. The avoidance is accomplished by either enhancing (or optimizing) the contrast through repair strategies such as epenthesis, metathesis, and dissimilation, or sacrificing it through assimilation and deletion. The perceptual component is situated in a general model of the interplay of external forces and phonology. This broadly defined model includes the cognitive and formal representations of phonological systems as well as four external factors (or filters): two lower-level effects (perception and production) and two higher-level effects (generalization and conformity) involving linguistic cognition and social influence. The implementation of the model is discussed.

Chapter two, "The interplay of phonology and perception considered from the perspective of perceptual organization" by Robert R. Remez, addresses issues in unimodal and multimodal organization of speech and the interchange of phonology and perception. Arguing for multimodal organization, Remez points out that "the detailed sensory constituents of speech responsible for psychoacoustic impressions seem to matter far less in phonetic perception" (p. 29) and that phonetic attributes are symbolic objects. Experimental evidence is examined with reference to psychoacoustics, especially to the general model of auditory organization known as Auditory Scene Analysis (Bregman, 1990). The experiments described in this chapter require basic familiarity with psychoacoustic research to appreciate the arguments based on sine wave replicas and quantized noise-band signals. In simple terms, the physically distorted signal is perceived as a phonetic signal because it satisfies an abstract description of the sensory variation produced by the human vocal tract. The interplay of phonology and perception is governed by phonological organization, which affects perceptual response. A powerful conclusion is that "it is unlikely that phonological contrasts require a perceiver to detect subtle details of sensation close to the limits of auditory or visual sensory resolution" (p. 48).

Chapter three, "Patterns of perceptual compensation and their phonological consequences" by Patrice S. Beddor, Rena A. Krakow, and Stephanie Lindemann, presents an experimental investigation of the perception of coarticulatory variation and its influence on phonological systems. Two types of coarticulation are examined: vowel-to-vowel coarticulation (with its consequences for vowel harmony) and nasal coarticulation (exhibiting influence on distinctive vowel nasalization and nasal harmony). The experiments reported in the chapter were designed to test listeners' sensitivity to the presence or absence of coarticulatory information. To do so, vowels were placed in inappropriate and appropriate contexts. In their responses, listeners chose the coarticulatorily appropriate (compensatory) contexts more often, suggesting their partial compensation for coarticulatory effects. In the authors' view, "it is the listener who mediates which aspects of coarticulatory variation become part of phonology, and an important component of this mediation process is compensation" (p. 73). More details about the experiments reported in the chapter can be found in Beddor et al. (2000) and Beddor & Krakow (1999).

Chapter four, "Markedness and consonant confusion asymmetries" by Steve S. Chang, Madelaine C. Plauche, and John J. Ohala is an investigation of consonant confusion asymmetries that parallel historical sound changes. The results from perception experiments provide an argument for the primacy of perceptual cues over markedness factors in accounting for the confusion asymmetry [ki]>[ti]. Given that, according to the markedness hypothesis, velar stops are more marked than alveolar stops in all vocalic environments, the [k]>[t] confusion in the environment of a high front vowel is hard to explain. The perceptual account gives a clear explanation, showing that the confusion occurs only in a particular environment and can be induced under laboratory conditions by manipulating the structure of experimental stimuli. An additional experiment simulating the actual diachronic sound change from velar stop to an affricate confirms that perceptual cues reflecting the acoustic-auditory paths of sound transmission play the primary role in the process.

Chapter five, "Effects of vowel context on consonant place identification" by Jennifer Cole and Khalil Iskarous, centers on the question of which patterns in the phonetics may originate in constraints in phonology. Perceptual ease or difficulty is invoked to examine the effects of the adjacent vowel context on the perception of consonantal place of articulation (C-Place). The authors present a perception experiment investigating phonological C-Place and V-Place dependencies for the stops [b], [d], [g], which were embedded in either front or back vowel contexts. The stimuli were presented in a clear speech condition and in noise. The results show the strongest effect of vocalic context on the identification of all three consonants when the preceding and following vowels are front. As the authors themselves point out, "the special status of this 'two-sided' context is not observed in phonological processes that restrict C-Place features, such as palatalization" (p. 118). Mixed results were also obtained for the other vocalic contexts, suggesting weak correlations between perceptual patterns and contexts for phonological constraints on C-Place. These findings provide evidence that the mapping from perceptual constraints in phonetics to phonological constraints is not direct.

Chapter six, "Adaptive design of sound systems: Some auditory considerations" by Randy L. Diehl, Michelle R. Molis, and Wendy A. Castleman, is an illustration of how the need for auditory distinctiveness may shape the structure of sound systems across language communities. Experimental evidence is provided to support three generalizations: 1) phonological contrasts are implemented by auditory enhancement, 2) phonological contrasts are implemented as qualitative distinctions, and 3) phonological category boundaries can be expressed in auditorily simple terms. The strategy of auditory enhancement, proposed by the Auditory Enhancement Hypothesis (Diehl & Kluender, 1989; Kingston & Diehl, 1994), is demonstrated by the tendency to select properties of speech sounds that reinforce phonological contrasts. Examples discussed are the lowering of F2 in the vowel /u/ and the ratio of silence interval to frication duration for affricates and fricatives. The significance of the obtained linearity of category boundaries for vowels is discussed in relation to the auditory representation of vowels and theories of vowel categorization.

Chapter seven, "The limits of phonetic determinism in phonology: *NC revisited" by Larry M. Hyman, considers the perennial question of the relation between phonetics and phonology. As a second goal, it reconsiders the phonological *NT constraint (originally, *NC) proposed by Pater (1996), suggesting a perceptual rather than articulatory motivation for it. Hyman views phonology as the intersection of phonetics and grammar. The process of phonologization proceeds through the following stages: "universal phonetics determines in large part what will become a language-specific phonetic property, which ultimately can be phonologized to become a structured, rule-governed part of the grammar" (p. 153). A thorough analysis of nasal+obstruent (N+C) interactions follows with particular reference to Bantu languages, examining postnasal voicing (motivating the constraint *NT) and postnasal devoicing (motivating the proposed constraint *ND). Examples from other languages and other postnasal processes are also included. It is proposed that synchronic phonology must recognize both *NT and *ND, of which *NT is perceptually preferred.

Chapter eight, "Contrast dispersion and Russian palatalization" by Jaye Padgett, seeks explanations for allophonic processes from the functional perspective. The principle that sounds in contrast should be sufficiently distinct is invoked to show "a deeper unity" of phonemic and allophonic contrasts. The case study focuses on an allophonic rule of Russian, by which the high front vowel /i/ is centralized after non-palatalized consonants. Acoustic data for the two vowel variants in two consonantal contexts in Russian and Irish are presented. The effective second formant (or "F2 prime") was calculated at consonantal release (either /b/ or /d/) and at the peak F2 for each vowel using the formula by Carlson et al. (1970). "F2 prime" means were then compared across vowels and speakers. Using this metric, the results show a greater difference in "F2 prime" means between the vowels at consonantal release than at the peak F2. This serves as an argument that non-palatalized consonants before the vowel /i/ are velarized. Perceptual data needed to test this contrast are not provided.

Chapter nine, "Directional asymmetries in place assimilation: A perceptual account" by Donca Steriade is an investigation of the interplay between perceived similarity and phonological patterns. Central to Steriade's account is the concept of perceived similarity, in which "perceptual factors - among them cue distribution - play a critical role in defining degrees of similarity between lexical forms and their conceivable modifications" (p. 222). A perceptually based analysis of asymmetries in place assimilation presented in the chapter aims to show that there exists a link between contrast-specific perceptual cue distribution, assimilatory direction, and rates of assimilation. These three are guided by the proposed P-map, i.e., speakers' knowledge about positional saliency of the contrast, which refers only to a perceived difference between two strings disregarding their phonemic status. The complexity of the chapter and depth of arguments and analyses deserve more space than this brief review can offer.

Chapter ten, "Perceptual cues in contrast maintenance" by Richard Wright explores the robustness of perceptual cues and their relevance to phonological contrasts and processes. Two claims about syllabic organization are tested experimentally: (1) formant transitions in the onset provide cues that are more robust than formant transitions in the coda, and (2) periodicity and transience affect the robustness of cues in noise. Both hypotheses were confirmed, thus validating the prominent role of onset relative to coda and phonetic motivation for constraints on segmental ordering along with sonority violations. This chapter draws attention to experimental design suitable for investigation of phonological facts under ordinary listening conditions, taking into account both signal degradation and differences in robustness of perceptual cues.

The variety of perspectives on the place of speech perception in phonology presented in this volume is impressive. It IS a meeting of the minds, as Lindblom introduced it, a forum where the difficult topic of speech perception in phonology calls for cooperation of phonologists and phoneticians. This volume shows that such cooperation is not only possible but also enjoyable, as both groups of researchers realize the need for each other's insights to discuss and understand the perceptual component of the organization of speech and language.

In this volume, attempts have been made to integrate insights from research on speech perception into formal (particularly constraint-based) models of phonology. Such integration is proposed in Hume & Johnson's model, in Steriade's account and, to an extent, in Remez's multimodal organization of speech. The book also abounds in experimental perception data that explore the perceptual basis of phonological systems. The experiments by Beddor et al., Chang et al., Cole & Iskarous, Diehl et al., and Wright use a variety of perceptual approaches to provide arguments for perceptual motivation of selected phonological processes. Finally, the chapters by Hyman and Padgett appeal to the role of perception in phonology indirectly, through arguments in the phonological analysis (Hyman) and phonetic investigation of acoustic patterns (Padgett). The book offers the best selection of papers currently available on the need for and the place of perceptual phenomena in phonology.

The interest in the perceptual component of language calls for a methodology that would test phonological patterns in laboratory conditions. The paradigms developed to explore perceptual characteristics of isolated vowels or consonants in CV/CVC syllables with steady phonetic context may not always be relevant for examining the phonological organization of language. The need for such a methodology is evident from the experiments presented throughout the book. As Wright points out, "as phonologists turn to the perceptual literature in search of the underpinnings of perceptually motivated constraints, they find volumes of perceptual research on speech cues but few experiments designed with phonology and phonological phenomena in mind" (p. 252). Close cooperation of phonologists and phoneticians trained in conducting speech perception research is thus necessary for the advancement of the field.

A word of caution about the use of perception literature is now in order. It is true that certain perceptual phenomena have not been systematically explored since early work in speech perception. As an example, the effective second formant in vowels (or "F2 prime") was isolated by Carlson et al. (1970) in a matching paradigm. In matching, listeners were asked to "make the two sounds [the reference signal and the two-formant target] as similar as possible" (1970:24), without appealing to their mental (or memory) representation for each vowel category tested. The "F2 prime" equation has been widely used since then. Following that tradition, Padgett applied the equation to acoustic data (Chapter eight) and obtained a clear separation between "F2 prime" means at consonantal release and at the peak F2. This may or may not hold for the perception of these cues, however. Using a double staircase adaptive procedure (Jesteadt, 1980), Jacewicz & Feth (2002) show that listeners' matching of the reference and target signals differs from the values predicted by the "F2 prime" formula for high front vowels in English. This implies a need for caution in making generalizations about perceptual characteristics of speech segments and applying them without careful experimental exploration.

The problem of methodology naturally leads to the need for discernment in the use of important concepts in "perceptual phonology", such as perceptual salience, cues, and perceived similarity, which are repeatedly referred to throughout the chapters. Although the notion of salience is rather intuitive, the perceptual cues that contribute to salience are highly variable under ordinary listening conditions. They are subject to degradation coming not only from the degree of signal distortion but also from temporal factors that tend to reduce the time available for processing natural speech. Assumptions about salience based on spectrographic displays of speech signals may therefore not always prove to be true. The integration of cues in a single percept depends on both peripheral and central auditory processing and may vary in specific communicative situations. This implies that generalizations about perceived similarity or salience of particular segmental properties are hard to formulate without empirical testing.

The role of speech perception in phonology awaits further exploration and formal representation in relation to sound patterns of language. This book is an initial and important step in this direction. It is highly recommended as a valuable contribution to the library of every phonologist, phonetician, and speech scientist.

Beddor, P. S. and Krakow, R. A. (1999). Perception of coarticulatory nasalization by speakers of English and Thai: Evidence for partial compensation. Journal of the Acoustical Society of America, 106, 2868-2887.

Beddor, P. S., Harnsberger, J., and Lindemann, S. (2000). Language-specific patterns of vowel-to-vowel coarticulation: Acoustic structures and their perceptual correlates. Unpublished manuscript.

Bregman, A. S. (1990). Auditory scene analysis. Cambridge: MIT Press.

Carlson, R., Granstrom, B. and Fant, G. (1970). Some studies concerning perception of isolated vowels. Speech Transmission Laboratory Quarterly Progress Status Report (STL-QPSR 2/3), Stockholm: 19-35.

Diehl, R. L. and Kluender, K. R. (1989). On the objects of speech perception. Ecological Psychology, 1, 121-144.

Jacewicz, E. and Feth, L. L. (2002). Center-of-gravity effects in the perception of high front vowels. To be presented at the 143rd Acoustical Society of America Meeting, Pittsburgh, PA, June 06.

Jesteadt, W. (1980). An adaptive procedure for subjective judgments. Perception & Psychophysics, 28, 85-88.

Kingston, J. and Diehl, R. L. (1994). Phonetic knowledge. Language, 70, 419-454.

Pater, J. (1996). *NC. Proceedings of the North East Linguistic Society, 26, 227-239.

ABOUT THE REVIEWER

Ewa Jacewicz is a postdoctoral research fellow in Speech and Hearing Science at The Ohio State University. Her research interests include speech perception (particularly vowel perception), acoustics of speech, psychoacoustics, laboratory phonology, phonology-phonetics interactions, and articulatory phonetics. She is currently exploring issues in vowel intensity and center-of-gravity effects in static and dynamic signals.

