LINGUIST List 13.1332

Mon May 13 2002

Review: Phonetics/Phonology: Hume & Johnson (2001)

Editor for this issue: Naomi Ogasawara

What follows is another discussion note contributed to our Book Discussion Forum. We expect these discussions to be informal and interactive; and the author of the book discussed is cordially invited to join in. If you are interested in leading a book discussion, look for books announced on LINGUIST as "available for discussion." (This means that the publisher has sent us a review copy.) Then contact Simin Karimi or Terry Langendoen.


  1. Ewa Jacewicz, Hume & Johnson, ed. (2001) Role of Speech Perception in Phonology

Message 1: Hume & Johnson, ed. (2001) Role of Speech Perception in Phonology

Date: Sat, 11 May 2002 12:49:10 +0000
From: Ewa Jacewicz
Subject: Hume & Johnson, ed. (2001) Role of Speech Perception in Phonology

Hume, Elizabeth, and Keith Johnson, ed. (2001)
The Role of Speech Perception in Phonology.
Academic Press, xviii+282pp, hardback ISBN 0-12-361351-5

Ewa Jacewicz, Speech and Hearing Science, The Ohio State University


This volume contains an edited collection of ten papers originally
presented at the satellite meeting "The Role of Perceptual Phenomena
in Phonology" held in conjunction with the XIVth International
Congress of Phonetic Sciences in San Francisco, CA, in August
1999. Elizabeth Hume and Keith Johnson, the organizers of the meeting
and the editors of the present volume, have made an important
contribution to the growing interest among phonologists in exploring
and integrating perceptual insights into descriptions of the sound
patterns of language. Recent developments in speech technology, speech
perception research, and phonological theory open the way to integrating
the perceptual component of spoken language into its formal model. In
his foreword to the book, Bjorn Lindblom emphasizes that "this
movement now seems to be gaining momentum on the international scene"
(p. vii). Comparing the book to a meeting of the minds, Lindblom
invites the audience to be open-minded about the explanatory scope
of external vs. formal factors discussed throughout the chapters.

In Chapter one, "A model of the interplay of speech perception and
phonology" by Hume and Johnson, three questions are considered. First,
to what extent does speech perception influence phonological systems?
Second, to what extent does phonological structure influence speech
perception? Third, where do speech perception phenomena belong in
relation to a formal description of the sound structure of language?
Particular attention is devoted to the influence of speech perception
on phonological systems, exploring the fact that contrasts of weak
perceptibility tend to be avoided in language. The avoidance is
accomplished by either enhancing (or optimizing) the contrast through
repair strategies such as epenthesis, metathesis, and dissimilation,
or sacrificing it through assimilation and deletion. The perceptual
component is situated in a general model of the interplay of external
forces and phonology. This broadly defined model includes the
cognitive and formal representations of phonological systems as
well as four external factors (or filters): two lower-level effects
(perception and production) and two higher-level effects
(generalization and conformity) involving linguistic cognition and
social influence. The implementation of the model is discussed.

Chapter two, "The interplay of phonology and perception considered
from the perspective of perceptual organization" by Robert R. Remez
addresses issues in unimodal and multimodal organization of speech and
the interchange of phonology and perception. Arguing for multimodal
organization, Remez points out that "the detailed sensory constituents
of speech responsible for psychoacoustic impressions seem to matter
far less in phonetic perception" (p. 29) and that phonetic attributes
are symbolic objects. Experimental evidence is examined with reference
to psychoacoustics, especially to the general model of auditory
organization known as Auditory Scene Analysis (Bregman, 1990). The
experiments described in this chapter require basic familiarity with
psychoacoustic research to appreciate the arguments based on sine wave
replicas and quantized noise-band signals. In simple terms, the
physically distorted signal is perceived as a phonetic signal because
it satisfies an abstract description of the sensory variation
produced by the human vocal tract. The interplay of phonology and
perception is governed by phonological organization which affects
perceptual response. A powerful conclusion is that "it is unlikely
that phonological contrasts require a perceiver to detect subtle
details of sensation close to the limits of auditory or visual sensory
resolution" (p. 48).

Chapter three, "Patterns of perceptual compensation and their
phonological consequences" by Patrice S. Beddor, Rena A. Krakow, and
Stephanie Lindemann, presents an experimental investigation of
perception of coarticulatory variation and its influence on
phonological systems. Two types of coarticulation are examined:
vowel-to-vowel coarticulation (with its consequences for vowel
harmony) and nasal coarticulation (exhibiting influence on distinctive
vowel nasalization and nasal harmony). The experiments reported in the
chapter were designed to test listeners' sensitivity to the presence
or absence of coarticulatory information. To do so, vowels were placed
in inappropriate and appropriate contexts. In their responses,
listeners chose the coarticulatorily appropriate (compensatory)
contexts more often, suggesting their partial compensation for
coarticulatory effects. In the authors' view, "it is the listener who
mediates which aspects of coarticulatory variation become part of
phonology, and an important component of this mediation process is
compensation" (p. 73). More details about the experiments reported in
the chapter can be found in Beddor et al. (2000) and Beddor & Krakow
(1999).

Chapter four, "Markedness and consonant confusion asymmetries" by
Steve S. Chang, Madelaine C. Plauche, and John J. Ohala is an
investigation of consonant confusion asymmetries that parallel
historical sound changes. The results from perception experiments
provide an argument for the primacy of perceptual cues over markedness
factors in accounting for the confusion asymmetry [ki]>[ti]. Given
that, according to the markedness hypothesis, velar stops are more
marked than alveolar stops in all vocalic environments, the [k]>[t]
confusion in the environment of a high front vowel is hard to
explain. The perceptual account gives a clear explanation, showing
that the confusion occurs only in a particular environment and can be
induced under laboratory conditions by manipulating the structure of
experimental stimuli. An additional experiment simulating the actual
diachronic sound change from a velar stop to an affricate confirms that
perceptual cues reflecting the acoustic-auditory paths of sound
transmission play the primary role in the process.

Chapter five, "Effects of vowel context on consonant place
identification" by Jennifer Cole and Khalil Iskarous centers on the
question of which patterns in the phonetics may originate in
constraints in phonology. Perceptual ease or difficulty is invoked to
examine the effects of the adjacent vowel context on the perception of
consonantal place of articulation (C-Place). The authors present a
perception experiment investigating phonological C-Place and V-Place
dependencies for stops [b], [d], [g] which were embedded either in
front or back vowels. The stimuli were presented in a clear speech
condition and in noise. The results show the strongest effect of
vocalic context on the identification of all three consonants when the
preceding and following vowels are front. As the authors themselves
point out, "the special status of this 'two-sided' context is not
observed in phonological processes that restrict C-Place features,
such as palatalization" (p. 118). Mixed results were also obtained
for the other vocalic contexts, suggesting weak correlations between
perceptual patterns and contexts for phonological constraints on
C-Place. These findings provide evidence that the mapping from
perceptual constraints in phonetics to phonological constraints is not

Chapter six, "Adaptive design of sound systems: Some auditory
considerations" by Randy L. Diehl, Michelle R. Mollis, and Wendy
A. Castleman is an illustration of how the need for auditory
distinctiveness may shape the structure of sound systems across
language communities. Experimental evidence is provided to support
three generalizations: 1) phonological contrasts are implemented by
auditory enhancement, 2) phonological contrasts are implemented as
qualitative distinctions, and 3) phonological category boundaries can
be expressed in auditorily simple terms. The strategy of auditory
enhancement, proposed by the Auditory Enhancement Hypothesis (Diehl &
Kluender, 1989; Kingston & Diehl, 1994) is demonstrated by the
tendency to select properties of speech sounds that reinforce
phonological contrasts. Examples discussed are lowering of F2 in the
vowel /u/ and the ratio of silence interval to frication duration for
affricates and fricatives. The significance of the obtained linearity of
category boundaries for vowels is discussed in relation to auditory
representation of vowels and theories of vowel categorization.

Chapter seven, "The limits of phonetic determinism in phonology: *NC
revisited" by Larry M. Hyman considers the perennial question of the
relation between phonetics and phonology. As a second goal, it
reconsiders the phonological constraint *NT (originally *NC) proposed by
Pater (1996), suggesting a perceptual rather than articulatory
motivation for it. Hyman views phonology as the intersection of
phonetics and grammar. The process of phonologization proceeds through
the following stages: "universal phonetics determines in large part
what will become a language-specific phonetic property, which
ultimately can be phonologized to become a structured, rule-governed
part of the grammar" (p. 153). A thorough analysis of nasal+obstruent
(N+C) interactions follows with particular reference to Bantu
languages, examining postnasal voicing (motivating the constraint *NT)
and postnasal devoicing (motivating the proposed constraint
*ND). Examples from other languages and other postnasal processes
are also included. It is proposed that synchronic phonology must
recognize both *NT and *ND, of which *NT is perceptually preferred.

Chapter eight, "Contrast dispersion and Russian palatalization" by
Jaye Padgett seeks explanations for allophonic processes from the
functional perspective. The principle that sounds in contrast should
be sufficiently distinct is invoked to show "a deeper unity" of
phonemic and allophonic contrasts. The case study focuses on an
allophonic rule of Russian, by which the high front vowel /i/ is
centralized after non-palatalized consonants. Acoustic data for the
two vowel variants in two consonantal contexts in Russian and Irish
are presented. The effective second formant (or "F2 prime") was
calculated at consonantal release (either /b/ or /d/) and at the peak
F2 for each vowel using the formula by Carlson et al. (1970). "F2
prime" means were then compared across vowels and speakers. Using this
metric, the results show a greater difference in "F2 prime" means
between the vowels at consonantal release than at the peak F2. This
serves as an argument that non-palatalized consonants before the vowel
/i/ are velarized. Perceptual data needed to test this contrast are
not provided.

Chapter nine, "Directional asymmetries in place assimilation: A
perceptual account" by Donca Steriade is an investigation of the
interplay between perceived similarity and phonological
patterns. Central to Steriade's account is the concept of perceived
similarity, in which "perceptual factors - among them cue distribution
- play a critical role in defining degrees of similarity between
lexical forms and their conceivable modifications" (p. 222). A
perceptually based analysis of asymmetries in place assimilation
presented in the chapter aims to show that there exists a link between
contrast-specific perceptual cue distribution, assimilatory direction,
and rates of assimilation. These three are guided by the proposed
P-map, i.e., speakers' knowledge about positional saliency of the
contrast, which refers only to a perceived difference between two
strings disregarding their phonemic status. The complexity of the
chapter and depth of arguments and analyses deserve more space than
this brief review can offer.

Chapter ten, "Perceptual cues in contrast maintenance" by Richard
Wright explores the robustness of perceptual cues and their relevance
to phonological contrasts and processes. Two claims about syllabic
organization are tested experimentally: (1) formant transitions in the
onset provide cues that are more robust than formant transitions in
the coda, and (2) periodicity and transience affect the robustness of
cues in noise. Both hypotheses were confirmed, thus validating the
prominent role of onset relative to coda and phonetic motivation for
constraints on segmental ordering along with sonority violations. This
chapter draws attention to experimental design suitable for
investigation of phonological facts under ordinary listening
conditions, taking into account both signal degradation and
differences in robustness of perceptual cues.


The variety of perspectives on the place of speech perception in
phonology presented in this volume is impressive. It IS a meeting of
the minds, as Lindblom introduced it, a forum where the difficult
topic of speech perception in phonology calls for cooperation of
phonologists and phoneticians. This volume shows that such cooperation
is not only possible but also enjoyable, as both groups of researchers
realize the need for each other's insights to discuss and understand
the perceptual component of the organization of speech and language.

In this volume, attempts have been made to integrate insights from
research on speech perception into formal (particularly
constraint-based) models of phonology. Such integration is proposed in
Hume & Johnson's model, in Steriade's account and, to an extent, in
Remez's multimodal organization of speech. The book also abounds in
experimental perception data that explore the perceptual basis of
phonological systems. The experiments by Beddor et al., Chang et al.,
Cole & Iskarous, Diehl et al., and Wright use a variety of perceptual
approaches to provide arguments for perceptual motivation of selected
phonological processes. Finally, the chapters by Hyman and Padgett
appeal to the role of perception in phonology indirectly, through
arguments in the phonological analysis (Hyman) and phonetic
investigation of acoustic patterns (Padgett). The book offers the best
selection of papers currently available on the need for, and the place
of, perceptual phenomena in phonology.

The interest in the perceptual component of language calls for a
methodology that would test phonological patterns under laboratory
conditions. The paradigms developed to explore perceptual
characteristics of isolated vowels or consonants in CV/CVC syllables
with steady phonetic context may not always be relevant for examining
the phonological organization of language. The need for such methodology
is evident from the experiments presented throughout the book. As
Wright points out, "as phonologists turn to the perceptual literature
in search of the underpinnings of perceptually motivated constraints,
they find volumes of perceptual research on speech cues but few
experiments designed with phonology and phonological phenomena in
mind" (p. 252). Close cooperation of phonologists and phoneticians
trained in conducting speech perception research is thus a necessary
solution for the advancement of the field.

A word of caution about the use of perception literature is now in
order. It is true that certain perceptual phenomena have not been
systematically explored since early work in speech perception. As an
example, the effective second formant in vowels (or "F2 prime") was
isolated by Carlson et al. (1970) in a matching paradigm. In matching,
listeners were asked to "make the two sounds [the reference signal and
the two-formant target] as similar as possible" (1970:24), without
appealing to their mental (or memory) representation for each vowel
category tested. The "F2 prime" equation has been widely used since
then. Following that tradition, Padgett applied the equation to
acoustic data (Chapter eight) and obtained a clear separation between
"F2 prime" means at consonantal release and at the peak F2. This may
or may not hold for the perception of these cues, however. Using a
double staircase adaptive procedure (Jesteadt, 1980), Jacewicz & Feth
(2002) show that listeners' matching of the reference and target
signals is different from values predicted by the "F2 prime" formula
for high front vowels in English. This implies a need for caution in
making generalizations about perceptual characteristics of speech
segments and applying them without careful experimental exploration.

The problem of methodology naturally leads to the need for discernment
in the use of important concepts in "perceptual phonology", such as
perceptual salience, cues, and perceived similarity, which are
repeatedly referred to throughout the chapters. Although the notion of
salience is rather intuitive, the perceptual cues that contribute to
salience are highly variable under ordinary listening conditions. They
are subject to degradation coming not only from the degree of signal
distortion but also from temporal factors that tend to reduce the time
available for processing natural speech. Assumptions about salience
based on spectrographic displays of speech signals may therefore not
always prove true. The integration of cues into a single percept depends
on both peripheral and central auditory processing and may vary in
specific communicative situations. This implies that generalizations
about perceived similarity or salience of particular segmental
properties are hard to formulate without empirical testing.

The role of speech perception in phonology awaits further exploration
and formal representation in relation to sound patterns of
language. This book is an initial and important step in this
direction. It is highly recommended as a valuable contribution to the
library of every phonologist, phonetician, and speech scientist.


Beddor, P. S. and Krakow, R. A. (1999). Perception of coarticulatory
nasalization by speakers of English and Thai: Evidence for partial
compensation. Journal of the Acoustical Society of America, 106,

Beddor, P. S., Harnsberger, J., and Lindemann,
S. (2000). Language-specific patterns of vowel-to-vowel
coarticulation: Acoustic structures and their perceptual
correlates. Unpublished manuscript.

Bregman, A. S. (1990). Auditory scene analysis. Cambridge: MIT Press.

Carlson, R., Granstrom, B. and Fant, G. (1970). Some studies
concerning perception of isolated vowels. Speech Transmission
Laboratory Quarterly Progress Status Report (STL-QPSR 2/3), Stockholm:

Diehl, R. L. and Kluender, K. R. (1989). On the objects of speech
perception. Ecological Psychology, 1, 121-144.

Jacewicz, E. and Feth, L. L. (2002). Center-of-gravity effects in the
perception of high front vowels. To be presented at the 143rd
Acoustical Society of America Meeting, Pittsburgh, PA, June 06.

Jesteadt, W. (1980). An adaptive procedure for subjective
judgments. Perception & Psychophysics, 28, 85-88.

Kingston, J. and Diehl, R. L. (1994). Phonetic knowledge. Language, 70, 419-454.

Pater, J. (1996). *NC. Proceedings of the North East Linguistic
Society, 26, 227-239. 


Ewa Jacewicz is a postdoctoral research fellow in Speech and Hearing
Science at The Ohio State University. Her research interests include
speech perception (particularly vowel perception), acoustics of
speech, psychoacoustics, laboratory phonology, phonology-phonetics
interactions, and articulatory phonetics. She is currently exploring
issues in vowel intensity and center-of-gravity effects in static and
dynamic signals.