Review of Phonetics and Phonology in Language Comprehension and Production

Reviewer: Lourdes Aguilar
Book Title: Phonetics and Phonology in Language Comprehension and Production
Book Author: Niels O. Schiller, Antje S. Meyer
Publisher: De Gruyter Mouton
Linguistic Field(s): Phonetics
Issue Number: 15.1373

Date: Fri, 30 Apr 2004 11:24:23 +0200
From: Lourdes Aguilar
Subject: Phonetics and Phonology in Language Comprehension and

EDITORS: Schiller, Niels O.; Meyer, Antje S.
TITLE: Phonetics and Phonology in Language Comprehension and Production
SUBTITLE: Differences and Similarities
SERIES: Phonology and Phonetics, 6
PUBLISHER: Mouton de Gruyter
YEAR: 2003

Lourdes Aguilar, Autonomous University of Barcelona


In the domain of linguistics, the relation between phonetic and
phonological representation has constituted a controversial issue since
the seminal works by Saussure, Baudouin de Courtenay or the Prague
Circle (Anderson, 1985). Only recently have researchers coming from
phonetics, with data from production, acoustic or perception studies,
examined the relations between abstract and concrete realizations of
speech, while phonologists have tested their models by designing
experiments (Beckman (ed.) 1990, and the series of Papers in Laboratory
Phonology: Kingston & Beckman, 1990; Docherty & Ladd, 1992; Keating,
1994; Connell & Arvaniti, 1995; Broe & Pierrehumbert, 2000; Local,
Ogden & Temple, 2004). In the domain of psycholinguistics, due to the
obvious differences in processing, much research has concentrated on
either speech production or speech comprehension without considering
the other process in detail. The book under review constitutes an
attempt to fill this gap, paying special attention to the relation
between the processes involved in speech.


The book opens with a clear introduction written by the editors, Niels
O. Schiller and Antje S. Meyer, which helps us to understand the main
scope of the book: to ask how exactly speech production and
comprehension are similar or different from each other. In the first
chapter ("Neighbors in the lexicon: Friends or foes"), Gary S. Dell and
Jean K. Gordon discuss the effects of neighborhood density (the
number of words that are phonologically similar to the targeted word),
a property that appears to have opposite effects in speech production
and perception: high neighborhood density has detrimental effects in
comprehension tasks, but facilitatory effects in production tasks.
These findings are explained in terms of the two-step
interactive-activation model of lexical access (Dell et al., 1997),
using a normal version of the model and versions designed to reflect
aphasic lesions. The authors argue that the interactive property of the
model (activation feeds back from phonological units to lexical units
during production) allows a target word's neighbors to increase the
probability with which that word is selected. Production and
comprehension differ in their response to neighborhood density because
production and comprehension tasks create different environments. In
comprehension tasks, where phonological neighbors are serious
competitors, a dense neighborhood is a disadvantage for accurate
retrieval. By contrast, in production tasks, the set of competitors
is defined on semantic grounds and, as a consequence, neighborhood
density facilitates the retrieval of the target. In summary, whether
neighbors are competitive or cooperative depends on the task: they are
costly in recognition but beneficial in production.
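
To make the notion of neighborhood density concrete, here is a minimal
sketch in Python. It rests on an assumption that the review does not
spell out: a neighbor is taken to be any word that differs from the
target by one segment substituted, added or deleted. The function
names and the toy lexicon (rough transcriptions, one character per
segment) are illustrative only and are not taken from Dell and
Gordon's chapter.

```python
def is_neighbor(target, word):
    """Return True if `word` differs from `target` by exactly one
    segment substitution, insertion or deletion (assumed criterion)."""
    if target == word:
        return False
    if abs(len(target) - len(word)) > 1:
        return False
    if len(target) == len(word):
        # Same length: exactly one position may differ (substitution).
        return sum(a != b for a, b in zip(target, word)) == 1
    # Lengths differ by one: removing one segment from the longer form
    # must yield the shorter form (insertion or deletion).
    shorter, longer = sorted((target, word), key=len)
    return any(longer[:i] + longer[i + 1:] == shorter
               for i in range(len(longer)))


def neighborhood_density(target, lexicon):
    """Count the words in `lexicon` that are neighbors of `target`."""
    return sum(is_neighbor(target, word) for word in lexicon)


# Hypothetical toy lexicon, invented for illustration.
toy_lexicon = ["bat", "kit", "kap", "at", "skat", "dog", "kat"]
density = neighborhood_density("kat", toy_lexicon)
print(density)
```

In this toy lexicon, "bat", "kit", "kap", "at" and "skat" count as
neighbors of "kat", while "dog" and the target itself do not. A high
count would correspond to the dense neighborhoods that the chapter
describes as costly in recognition but helpful in production.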

The chapter "Continuity and gradedness in speech processing" (James
McQueen, Delphine Dahan and Anne Cutler) focuses on the mapping of the
speech signal onto stored lexical knowledge in speech comprehension,
and compares it to the mapping of stored representations onto
articulatory commands in speech production. Their proposal is that the
two processing systems are fine-tuned to the different task demands of
speech decoding and encoding. They argue that speech decoding is
continuous and graded: information flows through the recognition system
in cascade all the way up to the meaning level, with no discrete
processing stages. Furthermore, the authors review empirical evidence
for the assumptions of multiple activation and relative evaluation of
lexical candidates: the multiple candidate words compete with each
other, and their activation is modulated by the subphonemic detail in
the speech signal. The way that phonetic and phonological information
is processed in speech encoding appears to be very different. Lexical
access is a two-stage process, with, on some accounts (e.g. WEAVER++
production model), strict seriality, and on other accounts (e.g. DSMSG
model), limited cascade between levels. In no current production model
is there massive parallel activation of word forms. Furthermore, it
appears that subphonemic detail need not be specified; instead, this
type of detail could be filled in by post-lexical rules. As noted
above, these differences may be explained by differences in the tasks
during speech production and comprehension: a system continuously
processing phonetic detail is ideal for speech comprehension, but
inefficient for production.

In the next chapter ("The internal structure of words: Consequences for
listening and speaking"), Pienie Zwitserlood discusses the
representation and processing of morphological information. The main
question is whether linguistically motivated distinctions of
morphological classes translate to language processing, and where
morphemes play a role during speaking and listening. She argues in
favour of an independent morphological level, distinct from syntactic,
semantic and phonological levels, and shared by speech production and
comprehension processes. There is evidence for this, even if word forms
are different: whereas lemmas are selected in production, word forms
are selected in comprehension, i.e., the word forms of speech
perception are larger than individual morphemes.

The issue examined in the chapter written by Ardi Roelofs ("Modeling
the relation between the production and recognition of spoken word
forms") is whether a single system participates in phonetic and
phonological processing during both production and recognition or
whether, instead, there are separate phonetic and phonological systems
for production and recognition. The question is addressed within the
context of computationally implemented models of spoken word
recognition (TRACE, Shortlist) and production (DSMSG, WEAVER++). A
shared system must have bidirectional links between sublexical and
lexical units: this would imply feedback from the sublexical to the
lexical level in production, and from the lexical to the sublexical
level in comprehension. To test this requirement, Roelofs reviews
findings from a variety of sources (picture naming tasks, chronometric
tasks, neuroimaging studies) and concludes that there is no positive
evidence for feedback. Instead, the available evidence supports the
idea of separate but closely linked feed-forward systems for
comprehension and production.

In the chapter "Articulatory Phonology: A Phonology for public language
use", Louis Goldstein and Carol A. Fowler pursue two goals: (a)
to develop a realistic understanding of language forms as language
users know them, produce them and perceive them, and (b) to understand
how the forms might have emerged in the evolutionary history of humans,
and how they arise developmentally, as the child interacts with
speakers in the environment. Their central idea is that language forms
are kinds of public action, not exclusively mental categories. The
relation between speech production and speech perception is discussed
from the point of view of Articulatory Phonology. In Articulatory
Phonology, vocal tract activity is analyzed into constriction actions
(gestures) of distinct vocal organs, and constriction formation is
appropriately modeled by dynamical systems. The authors argue that
phonological knowledge can be conceived as abstract constraints on
gestural coordination, and that the language forms that language users
know, produce and perceive must be the same. From the point of view of
Articulatory Phonology, gestures are the common phonological currency
between producers and listeners: gestures are preserved from language
planning, via production to perception, and directly structure the
acoustic signal. Listeners, therefore, perceive gestures, not acoustic
cues. This claim has not been accepted in the speech research
community, but Goldstein and Fowler think that there is enough indirect
evidence for it.

The chapter "Neural control of speech movements" by Frank H. Guenther
addresses the representations required for producing the syllables that
make up a spoken utterance (including auditory, tactile, proprioceptive
and muscle command representations) and their interactions with
reference to a model of the neural processes involved in the production
of speech sounds. Guenther proposes a neural network model of speech
motor skill acquisition and speech production (DIVA). This model
provides a unified explanation for a wide range of data on articulator
kinematics and motor skill development that were previously addressed
individually, including functional brain imaging, psychophysical,
physiological, anatomical and acoustic data. One advantage of the
neural network approach is that it allows the brain regions involved
in speech to be analyzed in terms of a well-defined theoretical framework.
According to the model, speech perception and production are supported
by separate, but closely linked cortical areas.

Miranda van Turennout, Bernadette Schmitt and Peter Hagoort ("When
words come to mind: Electrophysiological insights on the time course of
speaking and understanding words") focus on testing one specific model
(WEAVER++) to demonstrate the value of electrophysiological data for
speech comprehension and production research. They describe how
event-related brain potentials (ERPs) can be used to determine:
(a) the time that is needed for the retrieval of distinct types of
lexical information, and
(b) the order in which semantic, syntactic and word form information is
retrieved in speaking, listening and reading.
The authors compare the empirical data with the time course estimates
derived from the WEAVER++ model of speech production. The ERP
comprehension data together with the production data provide support
for the tested model. The findings indicate that during speech
production semantic processing precedes syntactic processing and
phonological encoding. Furthermore, the results of ERP studies support
the assumption made by WEAVER++ that in speech comprehension word form
information is accessed before syntactic and semantic information,
i.e., that the ordering of the retrieval processes is reversed relative
to production.

The last two chapters are dedicated to discussing different approaches
to the issue of the acquisition and representation of phonetic and
phonological categories in bilingual speakers. Núria Sebastián-Gallés
and Judith F. Kroll ("Phonology in bilingual language processing:
Acquisition, perception and production") examine the organisation of
sound systems in bilingual infants and adults. A central research issue
is to what extent bilinguals possess one or two phonological systems.
The results of the production and perception studies reviewed by the
authors converge closely in showing that lexical access is nonselective
with regard to language. One interesting aspect of the comparison
between perception and production is the difference related to the part
of the lexicon that becomes activated. In the perception studies, it is
the phonology of lexical form relatives that is engaged: i.e., words
that sound like the target word are activated regardless of the
language from which they are drawn. In the production studies, it is
the phonology of the translation equivalent that is active in addition
to phonological neighbors of the utterance itself.

James E. Flege ("Assessing constraints on second-language segmental
production and perception") examines theory and evidence related to the
production and perception of phonetic segments by second language (L2)
learners and monolingual native speakers of the same language. He
presents the Speech Learning Model (SLM), developed by him and his
colleagues. This model focuses explicitly on L2 speech acquisition, and
aims to account for changes across the life span in the learning of
segmental production and perception. The SLM proposes that native vs.
non-native differences are more likely to arise as the result of
interference from prior phonetic learning than from a loss of neural
plasticity: that is, adults retain the ability to form new phonetic
categories for speech sounds in L2, but phonetic category formation
becomes more difficult with increasing age because the phonetic systems
of the two languages are not fully separate. To support these
hypotheses, Flege provides empirical evidence from production and
perception studies of L2 vowel acquisition.


The book under review is not presented as a unified theory accounting
for speech production and comprehension; instead, contributors tackle
the central problem of the similarities and differences between speech
comprehension and production in different ways, employing contemporary
approaches such as the use of neuroimaging studies, electrophysiological
data or computational models. Nevertheless, the book is cohesive thanks
to the effort made by the authors to furnish new data and
explanations concerning the processes involved in speech.

After reading the book, we can infer that there is agreement across
authors on at least one important point: regardless of the different
communicative demands faced by listeners and speakers, nobody assumes
that there are entirely independent representations used exclusively in
production and comprehension. On the contrary, the different task
demands can account for the different patterns found in production and
comprehension research, as shown in the chapters written by Dell and
Gordon and by McQueen, Dahan and Cutler. Though for different reasons,
Zwitserlood and Roelofs argue that there must be separate phonological
and phonetic components for production and comprehension, while the
other levels are likely to be shared.

The consideration of the units and levels implied in the models is also
well covered. For instance, Guenther proposes a neural network model
which captures the control of processes from the syllabic level to the
level of muscle commands, whereas Roelofs' WEAVER++ model specifies
the speech-planning process up to the syllabic level.

Interestingly, the outstanding methodological standard of all papers
raises the issue of whether the experimental design is likely to
constrain the nature of the conclusions. In this connection, van
Turennout, Schmitt and Hagoort present very similar experimental
paradigms that can be used to study word production and comprehension,
which will be welcomed by all those interested in the investigation of
similarities and differences between speech production and
comprehension.

The book achieves a high scientific standard: the individual
chapters are written by authorities in their respective academic
subdisciplines, and topics are examined in depth, relating the
empirical evidence to computational or theoretical models. Furthermore,
the text structure facilitates the assimilation of the contents: all
the chapters begin with a brief summary, and end with a conclusion in
which the main arguments of the chapter are recapitulated. To
complement the data discussed in each chapter, a full list of
references appears at the end. This relative independence makes it
possible to read the book as a whole or one chapter at a time.

As noted above, a strength of the book is the inclusion of various
approaches, giving a comprehensive overview of the research in the
domain of speech production and comprehension. Nevertheless, this
advantage can be perceived as a drawback by readers, since the
differences in treatment and the terminology used can make the reading
difficult. For this reason, a glossary with the concepts and terms used
in the book would be helpful.

To conclude, the volume under review constitutes an important
contribution to the study of the representation of phonetic and
phonological knowledge in speech comprehension and production, and
their interface. It will be of interest to a wide range of researchers
in phonetics, phonology, psycholinguistics, cognitive science and the
study of speech disorders. With a teacher's guidance and assistance,
it is accessible to undergraduate students, while for postgraduate
students and speech researchers it is an excellent state-of-the-art
survey of existing work and a text full of suggestions for
additional research.


Anderson, Stephen R. (1985) Phonology in the Twentieth Century.
Theories of Rules and Theories of Representations, The University of
Chicago Press, Chicago and London.

Beckman, Mary E. (ed.) (1990) "Phonetic representation", Journal of
Phonetics, 18.

Broe, Michael B. & Janet B. Pierrehumbert (eds) (2000) Papers in
Laboratory Phonology V. Language Acquisition and the Lexicon,
Cambridge, Cambridge University Press.

Connell, Bruce & Amalia Arvaniti (eds) (1995) Papers in Laboratory
Phonology IV. Phonology and Phonetic Evidence, Cambridge, Cambridge
University Press.

Dell, Gary S., Myrna F. Schwartz, Nadine Martin, Eleanor M. Saffran and
Debra A. Gagnon (1997) "Lexical access in aphasic and nonaphasic
speakers", Psychological Review 104: 801-838.

Docherty, Gerard J. & D. Robert Ladd (eds) (1992) Papers in Laboratory
Phonology II. Gesture, Segment, Prosody, Cambridge, Cambridge
University Press.

Keating, P. A. (ed.) (1994) Papers in Laboratory Phonology III.
Phonological Structure and Phonetic Form, Cambridge, Cambridge
University Press.

Kingston, John & Mary E. Beckman (eds) (1990) Papers in Laboratory
Phonology I. Between the Grammar and Physics of Speech, Cambridge,
Cambridge University Press.

Local, John, Richard Ogden & Rosalind Temple (eds) (2004) Papers in
Laboratory Phonology VI. Phonetic Interpretation, Cambridge, Cambridge
University Press.

Lourdes Aguilar is a lecturer in Spanish language and linguistics in
the Department of Hispanic Philology at the Autonomous University of
Barcelona. Her research interests include phonetics and phonology and
their interface, discourse structure, and speech and language