LINGUIST List 24.3159

Mon Aug 05 2013

Review: Language Acquisition; Psycholinguistics: Cutler (2012)

Editor for this issue: Rajiv Rao

Date: 01-Jul-2013
From: Robert Port
Subject: Native Listening

AUTHOR: Anne Cutler
TITLE: Native Listening
SUBTITLE: Language Experience and the Recognition of Spoken Words
YEAR: 2012

REVIEWER: Robert F. Port, Indiana University Bloomington


‘Native Listening’ offers a thoughtful review of a large body of the psychological literature relevant to understanding what knowledge and skills speakers apply when they listen to the language they know best. Dr. Cutler has devoted her career to many of the issues relating to this problem. The book reflects her broad knowledge and experience in this interdisciplinary field, which includes language development in children, linguistics, psychology of memory, second language studies and the perceptual processing of speech. Her primary conclusion is implicit in her title: all of us native speakers acquire a specialized system for interpreting the language we have grown up with, one that is directly analogous to the richly detailed knowledge and skills of “native speakers” in producing their own language. What native listeners hear when their language is spoken is vastly different from what non-natives hear listening to another language. Native listeners are finely tuned to the detailed speech habits and the statistical distributions of the speech community they belong to. Although this might sound uncontroversial to some readers, this conclusion is at odds with the assumption of many linguists (e.g., Chomsky and Halle, 1968, and most generative linguists) that all children come to language learning with a uniform (and universal) phonetic alphabet for the mental representation of speech. According to this view, we all hear the phonetics of every language in essentially the same way, so the notion of “native listening” hardly makes any sense.

The writing in ‘Native Listening’ is nearly always clear and, just as the reader might be thinking “So what am I supposed to take from this section?”, the section closes with a helpful summary of its main points. In addition, each chapter closes with a review of its main argument.

The introductory chapter, on “listening and native language,” reviews basic articulatory phonetics and the basics of linguistic speech systems. However, there is almost nothing on human hearing. The second chapter is about what spoken language is like, touching upon topics such as categorical perception and within- and cross-word ‘embeddings’ (of one word within another), since the pronunciation of any simple phrase like “we start” contains acoustic patterns that match other words such as “we, wee, Weese, east, star, Dar, dart,” etc. The next two chapters look at how words are recognized, given very difficult problems such as embedding. They review phonetic cues to word boundaries and various experimental methods like ‘lexical decision,’ ‘eye tracking,’ and various kinds of ‘priming.’ Chapter 5 explores evidence for the “possible word constraint” (i.e. the phonological patterns of word shape that are characteristic of each language) and its role in making word segmentation more reliable despite the widespread occurrence of embedding. The following chapter discusses the fine structure of speech, including both subtle dialect variation and individual pronunciation differences. Chapter 7 discusses prosody, especially stress and pitch accent and their role in native speech understanding. The discussion also includes surprising cases where native listeners seem to ignore aspects of the prosody that they produce. Chapter 8 provides an overview of speech acquisition in children and the great importance of statistics in the acquisition of native listening skills. Chapters 9 and 10 deal with the acquisition of a second language as an adult and evaluate such problems as the difficulties experienced by second-language speakers listening in noisy situations. They also review data on bilinguals who appear to speak two languages with equal skill, with one conclusion being that there is nearly always some detectable asymmetry between the languages of a bilingual.
Chapter 11, on the plasticity of adult speech perception, offers an extensive review of pronunciation changes in adults due to ongoing change in the ambient language or due to contact with another language. The final chapter is a 40-page overview of Cutler’s views on how people recognize speech in context by exploiting the many aspects of their “exquisitely tailored” skills in listening to their native language. She successfully makes her case for the importance of “native listening,” which helps us understand how we are able to follow speech as well as we all do. There are no simple tricks or methods for “universal feature” extraction that account for the data published over the past 50 years. Overall, this concluding chapter pulls everything together and effectively presents the author’s summary about speech perception and acquisition.


One challenge for the author and publisher of this book was how to indicate the exact pronunciation of English and Dutch test words and pseudo-words. Although there is a 4-page appendix presenting the basic International Phonetic Association (IPA) alphabet (the standard tool for this purpose), there is almost no use of the IPA in the volume. Instead, the author relies primarily on English or Dutch orthographic representations for test words and pseudo-orthographic spellings for non-words. But the orthographic spellings of English and Dutch are generally ambiguous about pronunciation details, especially if one has weak knowledge of how to pronounce Dutch orthography (as this reviewer does). Many uncertainties resulted. To give an example (p. 87), in one experiment the author employs English words like “mask” and nonsense words like “maskek” as stimuli. This raises the question of how the pseudo-orthographic form “maskek” is pronounced. Is the second syllable pronounced like “cake” (with aspirated [k] and a tense vowel) or like “skeck” (without aspiration and a lax vowel)? A representation in IPA would have made that explicit, even though it would demand a little more of the typesetter and the reader. These confusions arose multiple times in almost every chapter of the book. In fact, given the number of Dutch words and phrases, an additional appendix introducing readers to the pronunciation of Dutch orthography would have been welcome.

One theoretical issue puzzled me. The author makes a persuasive case for the richness of the skills of the native listener (and speaker) and for the necessity of both an abstract memory representation of words and also storage of many alternative pronunciations and speaker-specific variants (e.g. p. 423). Among the evidence for detailed, exemplar-like memory (see Pierrehumbert, 2001) is that recognition memory (where the participant’s task is to decide for each item in a list whether or not it occurred earlier in the list) for aurally presented words shows that repetition of a word in the same voice improves accuracy relative to repetition of a word spoken by a different voice (Palmeri et al., 1993). This shows that speakers are not relying on an abstract phonological memory, but rather a concrete, detailed memory that includes information about speakers’ voices. Other evidence is a demonstration that listeners can recognize speech better in noise when they are familiar with the speaker’s voice than when they are not (Nygaard, Sommers and Pisoni, 1994). In addition to evidence that speakers store fairly detailed records of speech that they hear, Cutler also makes a clear case for some degree of abstraction and generalization in memory representations, and therefore, for the insufficiency of a pure exemplar memory that stores only detailed records of heard utterances. For example, she cites evidence that listeners can be taught that one speaker’s idiosyncratic productions of a lisping fricative that is acoustically midway between /f/ and /s/ should be interpreted as an /s/. One way to do this is to present participants with a short story containing many /s/s, all of which are replaced with the ambiguous fricative (and, of course, for another listener group, all the /f/s are replaced with the ambiguous fricative). 
Of course, each subject group uses context in the story to infer what words the speaker intends (Norris, McQueen and Cutler, 2003; Eisner and McQueen, 2006). But now if this ambiguous lispy-sounding /s/ is inserted in the productions of words in the voice of a different speaker, listeners do not show the adaptation that they learned (Eisner and McQueen, 2005). That listeners generalize the odd pronunciation to new words, but not to a new voice, is evidence that they are able to adapt to speaker idiosyncrasies and to project a pronunciation from heard words to new words.

Elsewhere, Cutler has used the cautious term “prelexical abstraction” to refer, apparently, to some kind of abstract patterns smaller than words that are sufficiently identical from word to word to allow generalization of an acoustic pattern from a few words to other, unheard words (Cutler et al., 2006). She may be roughly correct about “prelexical” (or “sublexical”) abstractions of some sort, but using the word “phoneme” for them, as she does in the book under review, may cause confusion. Phoneme implies much more. There are many further assumptions about phonemes or phonological segments made by linguists. Most approaches to phonology assume that phonemes are completely context-free (except as constrained by formal rules) and serially ordered (with no internal temporal structure or temporal overlap) and, crucially, are very limited in number for any language (believed to be fewer than about 50 in most cases). Then, these letter-like objects are employed so as to provide a single, unique memory representation for each lexical item in a language (Chomsky and Halle, 1968; see Port and Leary, 2005; Port, 2011). However, no research has been conducted by anyone to verify most of these assumptions. Cutler’s linguistic memory representations, which include some kind of abstract representation as well as speakers’ “extensive records of the specifics of their speech processing experience” (p. 421), imply a model of phonological memory that differs greatly from the ideas underlying traditional linguistics. Of course, she is a psycholinguist, not a formal phonologist, but the results she cites have implications that seriously undermine some major tenets of linguistic theory.
There is no problem with her informal use of the term “phoneme” to point out some aspects of this complex representational system, but linguists should beware that she is proposing memory structures that are quite incompatible with what linguists generally assume when they propose formal phonological rules or formal constraints.

Leaving that issue aside, Cutler has produced a clearly written exposition of the state of understanding of the enormous challenges faced by native listeners and the spectacular ease they exhibit in understanding speech as well as they do. As one of the leaders in the field, she focuses, quite naturally, on the work done in her very productive laboratory over the past third of a century, at roughly the time when she herself has moved on to a new laboratory in Australia. This book deserves close attention by all who are interested in the psychological representation of language and the development of human speaking and listening skills.


Chomsky, N. and Halle, M. (1968). The sound pattern of English. Harper & Row, New York.

Eisner, F. and McQueen, J. M. (2005). The specificity of perceptual learning in speech processing. Perception and Psychophysics 67, 224-238.

Eisner, F. and McQueen, J. M. (2006). Perceptual learning in speech: Stability over time. Journal of the Acoustical Society of America, 119, 1950-1953.

Norris, D., McQueen, J. M. and Cutler, A. (2003). Perceptual learning in speech. Cognitive Psychology 47, 204-238.

Nygaard, L. C., Sommers, M. S. and Pisoni, D. B. (1994). Speech perception as a talker contingent process. Psychological Science 5, 42-46.

Palmeri, T. J., Goldinger, S. D. and Pisoni, D. B. (1993). Episodic encoding of voice attributes and recognition memory for spoken words. Journal of Experimental Psychology: Learning, Memory, and Cognition 19, 309-328.

Pierrehumbert, J. (2001). Exemplar dynamics: Word frequency, lenition and contrast. In J. Bybee and P. Hopper (Eds.), Frequency effects and the emergence of lexical structure (pp. 137-157). Amsterdam: John Benjamins.

Port, R. (2011). Language as a social institution: Why phonemes and words do not have explicit psychological form. Ecological Psychology 22, 304-326.

Port, R. and Leary, A. (2005). Against formal phonology. Language 81, 927-964.


Robert Port is Emeritus Professor of Linguistics and Cognitive Science at Indiana University. His research has focused on phonetics, especially on issues related to speech timing and the dynamics of speech production and perception. He has conducted research especially on the phonetics of English, Japanese and German. In recent years he has emphasized the implausibility of a universal phonetic alphabet and the necessity of representing language in memory in a much richer form than is customary in linguistics.

Page Updated: 05-Aug-2013