Date: Fri, 26 Nov 2004 13:03:28 +0800
From: David Deterding <dhdeter@nie.edu.sg>
Subject: Phonetic Interpretation: Papers in Laboratory Phonology VI

EDITORS: Local, John; Ogden, Richard; Temple, Rosalind
TITLE: Phonetic Interpretation
SUBTITLE: Papers in Laboratory Phonology VI
PUBLISHER: Cambridge University Press
YEAR: 2003
David Deterding, NIE/NTU, Singapore
OVERVIEW
This book is a selection of papers from the Laboratory Phonology VI conference held in York in 1998. Though it has taken over five years to come out (and so it has in fact emerged later than the comparable volume for the 2000 Laboratory Phonology conference, Gussenhoven & Warner 2002), it still represents a valuable and compact overview of recent work by a number of prominent scholars from around the world into the nature of phonological representation and its connection with phonetic realisation.
As is the norm for research under the rubric of Laboratory Phonology (e.g. Connell & Arvaniti 1995, Broe & Pierrehumbert 2000), the focus is on the fine details of articulation and perception and how research into these can provide evidence about the nature of phonological representation. Most of the papers therefore report on meticulous measurements of data from a small number of speakers producing carefully prepared material under laboratory conditions, and there is almost no discussion of naturally occurring speech.
After an introduction by the editors, the book is divided into four Parts, each with four Chapters followed by a critical commentary on the papers in that Part, though the first paper in each of Parts I, II and III does not undergo critical evaluation in this way. (These three papers are the longest, with an average of 25.7 pages compared with 16.2 pages for the others, so one assumes they represent the keynote presentations from the conference.) Only in Part IV are all four preceding papers discussed in the critical commentary.
SYNOPSIS
Part I is on phonological representation in the lexicon.
In Chapter 1, Mary Beckman and Janet Pierrehumbert report on a priming experiment in which subjects gave the first word that came to mind in response to pairs of words either with similar meaning or shared phonemes, and they conclude that the results support a model in which the semantic and phonological entries for a word are stored in separate but connected compartments of the lexicon. Furthermore, they argue that an abstract phonemic representation for lexical items assists infants in negotiating the bottlenecks in language acquisition.
In Chapter 2, Sarah Hawkins and Noel Nguyen build on previous work showing that the quality of an initial /l/ is affected by the voicing of the coda, and they predict a longer reaction time (RT) for cross-spliced tokens. Although the expected effect was not found, for real words they did find a correlation between RT and the size of the acoustic mismatch after cross-splicing. This was especially true for a voiceless coda, probably because the longer vowel for a voiced coda allows the relatively weak perceptual cues in the initial /l/ to be overridden. Finally, they argue that the fact that cues to coda voicing extend throughout the syllable lends support to a holistic rather than phoneme-based model of word recognition.
In Chapter 3, Jennifer Hay, Janet Pierrehumbert and Mary Beckman describe experiments in which subjects listened to imaginary words created by cross-splicing, with items containing common medial nasal-obstruent clusters, e.g. /slentu/, contrasted with less likely or impossible ones, e.g. /slemku/. The listeners reported how well-formed they judged the words and also transcribed what they heard. It was found that the perceived well-formedness of a word is related in a gradient fashion to the likelihood of its occurrence, and this can broadly be predicted from counts of attested phoneme combinations in a database such as CELEX.
In Chapter 4, Richard Wright investigates the acoustic quality of vowels in "easy" words (like 'gave') and "hard" words (like 'mace'), where easy/hard is determined not just by frequency of occurrence but also by the number of words that sound similar, and he finds that there is greater dispersion in the F1/F2 space for the vowels of hard words, especially those with a point vowel such as /i, a, u/. Furthermore, he reports that his speakers differed substantially, with one male and one female speaker having greater reduction in dispersion for the easy vowels than the other eight speakers.
In Chapter 5, John Coleman comments on Chapters 2, 3 and 4 and observes that although Laboratory Phonology is more of a way of doing phonology than a theory in its own right, it does present a welcome antidote to the shortcomings of generative phonology. With regard to Chapter 2, he suggests that if voicing is treated as a property of the rime rather than the coda, then influences of the voicing of a final plosive on an initial /l/ are no longer particularly surprising, and this lends support to a syllable-based representation. And he argues that the findings of Chapters 3 and 4 provide strong evidence for the need to include probabilistic data in phonological representations, as deterministic rule-based systems cannot predict patterns of well-formedness or the fine details of articulation.
The papers in Part II investigate the influence of phrase structure on the articulation of sounds, particularly consonants.
In Chapter 6, John Harris argues against rule-based, derivational accounts of ambisyllabicity, and with a representation using element theory, where each basic element such as (H) 'high source' (resulting in aspiration) and (U) 'labial' (characterised by lowered F2) has a direct and independent acoustic interpretation, he shows that the foot is the appropriate domain for representing phonetic phenomena such as the lenition of consonants in Danish and Ibibio.
In Chapter 7, Mariapaola D'Imperio and Barbara Gili Fivela investigate Florentine Italian for the effects of a clause or phrase boundary and also of narrow, contrastive focus on the occurrence of Raddoppiamento (Fono-) Sintattico (RF), the lengthening of an initial consonant triggered by a stressed final vowel in the preceding word, and they find that a clause boundary does block RF as expected, but that a phrase boundary and narrow focus do not always have the predicted effect.
In Chapter 8, Patricia Keating, Taehong Cho, Cecile Fougeron and Chai-Shune Hsu use electro-palatography (EPG) to compare the effects of various kinds of phrasal boundary on the duration and degree of contact between the tongue and the roof of the mouth for syllable-initial /n/ and /t/ in French, Korean and Taiwanese, and they show that although the levels of phrasing vary for the different languages, all speakers make at least one distinction which has a substantial effect on the articulation of initial alveolar consonants.
In Chapter 9, Robert Ladd and James Scobbie investigate the duration of various consonants in Sardinian to see whether postlexical geminates (PLGs), the long consonants that occur as a result of assimilation, are identical to geminates that originate in the lexicon. They report that, unlike the situation for many types of assimilation in English for which there are residual effects from the underlying sounds, Sardinian PLGs are the same as lexical geminates with no residual effects, and so they conclude that gestural overlap does not provide a suitable model for the assimilatory patterns of Sardinian. However, they do concede that gestural overlap may explain some cases of residual nasalisation in Sardinian.
In Chapter 10, Jonathan Harrington comments on Chapters 7, 8 and 9. First, for Chapter 9, he observes that all final consonants in Sardinian are alveolar, so place of articulation can be left unspecified, and this means that the PLG data have no bearing on the kind of assimilation in English that originally gave rise to accounts of gestural overlap. Next, he reports that although a mora-based model may be suitable for describing the Sardinian data, such a model does not work for the RF data from Chapter 7, as there is no evidence that RF in Florentine Italian shows the categorical shift that mora relinking would entail, and he further suggests that, rather than a purely syntactic analysis, the RF data might be investigated using various intonational break indices. Finally, for Chapter 8, he proposes that the differences in articulation found for initial consonants in various languages are perceptually based (something the authors themselves are equivocal about), and he further notes that it would be valuable to extend the work by investigating indigenous Australian languages which favour VC syllables.
Part III is concerned with syllable structure, particularly the timing and quality of initial and final consonantal gestures.
In Chapter 11, Terrance Nearey reports first on simulations to investigate the factors that work best in speech perception of syllables in noisy conditions, and he concludes that segment-sized units work best. Then he investigates the combinations of acoustic factors that can best model the previously-reported perceptual responses of listeners to stimuli spanning the /bla, dla, bra, dra/ continuum, and he finds that a segmental model which includes quadratic effects for F2 and F3 works best, and there is no clear need to include any terms for diphones.
In Chapter 12, Bryan Gick measures lip aperture and the position of various parts of the tongue for initial, ambiguous and final /w,j,l/ in American English, for example in 'ha wadder' (initial /w/), 'how otter' (ambiguous /w/) and 'how hotter' (final /w/), and he finds that lip aperture for ambiguous /w/ behaves like the tongue tip for /l/ in showing evidence of resyllabification, but none of the measurements for ambiguous /j/ undergo such a shift. He concludes that, if lip aperture for /w/ is treated as a consonantal gesture, his data support a model based on gestural overlap, with resyllabification involving retiming of the consonantal and vocalic gestures, and he finally suggests that /l/ and /w/ (and also /r/) behave like consonants while /j/ is purely a vowel.
In Chapter 13, Paul Carter measures F2 as an indication of the darkness of [l] and [r] for one speaker of each of four dialects of British English, representing the four combinations of rhotic/non-rhotic and clear/dark initial [l]. For the non-rhotic varieties, he confirms earlier reports that dark initial [l] is found with clear [r] while clear initial [l] is paired with dark [r], but for the rhotic varieties, this pattern is not found, as both initial [l] and initial [r] are dark for the speaker from Fife, though this speaker does have a clear final [r]. Carter also measures formant transitions as an indication of the timing of apical and dorsal gestures, and he finds that, for a dark initial [l], the dorsal gesture is timed at or before the apical gesture, which indicates that vocalic gestures do not necessarily occur closer to the syllable peak than consonantal gestures.
In Chapter 14, Kenneth de Jong investigates the timing of /p/ and /b/ in onset and coda positions, with 'pea', 'eep', 'bee' and 'eeb' repeated at varying speech rates dictated by a metronome, and he finds that, although at fast speech rates coda /p/ tends to become perceptually similar to onset /b/, some aspects of the coda enunciation remain, so there is not a complete switch from coda consonant to onset consonant as previously claimed. Furthermore, he reports that, with changing speaking rates, there is a delay in the shifting of patterns, so that once speakers start to produce a pattern, they tend to continue with it.
In Chapter 15 Peter Ladefoged raises a number of questions about Chapters 12, 13 and 14. For Chapter 12, he notes that treating [l] as a combination of vocalic and consonantal gestures only works for American English, as in his own pronunciation of British English there is no raising of the back of the tongue for initial [l] nor any contact between the tip of the tongue and the alveolar ridge for final [l], so for him, initial and final [l] must be treated as separate gestures, or, in traditional terms, as extrinsic allophones of /l/. For Chapter 13, although he praises the elegant overall treatment, he questions if it is adequate to have one speaker for each dialect, as a single informant may have idiosyncratic speech patterns. And for Chapter 14, he notes that only stressed syllables were studied, and consonants can behave differently in unstressed syllables. He finally suggests that we might even consider treating 'happy' and 'supper' as single syllables, and, referring to data from Scottish Gaelic and Montana Salish, he proposes that syllables may in fact be totally irrelevant constructs in some circumstances.
Part IV covers miscellaneous topics in speech production.
In Chapter 16, Bushra Adnan Zawaydeh uses an endoscope to investigate the articulation of oral, pharyngeal, and guttural consonants in Jordanian Arabic, and she reports that the gutturals are characterised by narrower pharyngeal diameter than non-gutturals. In contrast, for Interior Salish, she claims that lowered first formant is needed to describe the grouping of consonants. She thus concludes that articulatory features are necessary for classification of the guttural consonants in Arabic while acoustic features are more appropriate for Salish languages.
In Chapter 17, Daniel Silverman manipulates recordings of words in Jalapa Mazatec, a Mexican language which is characterised both by a range of tones and also by breathy and modal phonation. He reports that listeners are able to perceive pitch contrasts more clearly on modal vowels than breathy vowels, and he suggests that this explains why, for vowels in Mazatec which consist of a breathy portion followed by a modal portion, the tonal contrasts only occur during the modal portion.
In Chapter 18, Katrina Hayward, Justin Watkins and Akin Oyetade investigate the H, M and L tones of Yoruba, to see whether two of them can be grouped together in a separate register. On the basis of various acoustic measurements and also the closed quotient from a laryngograph waveform, they conclude that the L tone is indeed characterised by a distinctive voice quality, so it might be analysed as belonging to a different register than the other two, though unexpectedly they find that the L tone has greater spectral tilt than the M or H tones.
In Chapter 19, Keiichi Tajima and Robert Port adopt "speech-cycling" methodology, using a metronome to guide speakers of English and Japanese in producing utterances with a fixed number of syllables in a waltz-timed beat, and then they see what happens to the timing when the middle syllables are manipulated, either by switching them around or by introducing an extra syllable. They report that the English speakers tend to maintain a stress-timed rhythm, but the rhythmic basis of the Japanese utterances varies when an extra syllable is introduced, with subjects aligning their speech with the rhythmic beat in different ways, and it is not clear if mora-timing is the best way to categorise the timing of Japanese.
Finally, in Chapter 20, Gerard Docherty reviews the papers in Part IV and asks three important questions. Firstly, to what extent do the results of laboratory work describe the characteristics of natural speech? He believes that natural data are important, and he particularly raises concerns about the artificiality of the speech-cycling data from Chapter 19. Secondly, is it true that speakers always strive to maintain maximal contrasts in their speech? For example, do the data from Chapters 17 and 18 on tonal realisations indicate that speakers actually make use of the enhanced auditory options that are available? He cites data for the NURSE vowels and final plosives in natural speech from Tyneside to show that speakers often do not maintain contrasts, as social factors may outweigh the desire to achieve maximal clarity. Thirdly, if a phonetic attribute is found to co-occur with a phonological feature, can we be sure that the two are linked? Particularly in Chapter 16, does the co-occurrence of an articulatory or acoustic feature with a set of consonants for Arabic and Salish really indicate that the consonants are grouped using those features?
CRITICAL EVALUATION
This volume presents a succinct and impressive overview of recent research into phonology under laboratory conditions. While the wealth of material that is packed into a single volume is admirable and will be highly valued by many, others may find the brevity of some of the papers a little frustrating. There are regular comments from the authors that "due to space limitations" (p. 258) "we do not have the space to report further" (p. 172), and for example we are informed that "vowel-duration measurements followed standard procedures" (p. 134) without being told what those procedures were. Often there is discussion of results that are "not shown in the figure" (p. 154, p. 156), reference is made to plots that "we do not show" (p. 66), and many of the authors acknowledge that their chapter is a summary of some other more comprehensive account "which may be consulted for more details" (p. 41). In many cases, one feels that it is necessary to get hold of the full report published elsewhere to understand the research fully, though that is not helped when we are advised to "see Gick, forthcoming, for detailed discussion of this matter" (p. 226) but 'Gick forthcoming' is not actually included in the References at the back of the book despite at least five citations in the text (p. 222 twice, p. 226 twice, p. 233).
Unfortunately, there are also quite a few errors which exacerbate the difficulties in interpreting some of these papers. Many are merely irritating, with misspelled words in labelling the figures ('onomorphemic' p. 69; 'vvoicing' p. 261) and erroneous cross-references (Section 2.2.1 instead of 12.2.1, p. 227; Experiment 3 rather than 2, p. 63). Sometimes, these errors affect the detailed description of data, with 'crank' transcribed with an initial /c/ and 'sermon' and 'syrup' listed as sharing two segments while the transcription actually indicates three shared segments (p. 19), and all items beginning with /str/ and /gr/ are transcribed with a turned-r (p. 61), while those beginning with /kr/ have a lower-case-r, with this distinction retained later in the text (p. 65) even though there seems to be no logic behind it. Finally, spelling of 'histeresis' with the more usual 'y' instead of the first 'i' (p. 262 ff) would be helpful to those of us who need to look the word up in a dictionary.
There are a few problems with the data in Tables. Often this just involves misaligned text (Table 7.1, p. 135; Table 18.1, p. 313; 'm.sg' in item 5, p. 167; an extra word 'atom' in item 9, p. 113), but occasionally something is wrong with the numbers, so in Table 18.1 (p. 313), 0.6 is given as the mean of 5.4, 0.1 and 3.9, and even more bizarrely the whole of the second last line is wrong, with for example 170 given as the mean of 24, 23 and 28.
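The arithmetic behind these two complaints is easy to check. The following is a minimal sketch in Python; the function name and the tolerance are illustrative assumptions, and the only figures used are those quoted above from Table 18.1.

```python
from statistics import fmean

def mean_matches(values, reported, tol=0.05):
    """Return True if `reported` lies within `tol` of the arithmetic mean of `values`.

    `tol` is an assumed allowance for rounding in the printed table.
    """
    return abs(fmean(values) - reported) <= tol

# Figures as quoted in the review from Table 18.1 (p. 313).
row_a = [5.4, 0.1, 3.9]   # printed mean: 0.6
row_b = [24, 23, 28]      # printed mean: 170

print(round(fmean(row_a), 2), mean_matches(row_a, 0.6))
print(fmean(row_b), mean_matches(row_b, 170))
```

On these figures the means would be about 3.1 and exactly 25, so neither printed value can be explained by rounding.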
Some of the errors are not just irritating but seriously disrupt interpretation of the material. In Nearey's paper, Table 11.2 lists Model II as an enhancement of Model II (is it really a recursive model?) with the addition of G x F3 (one assumes it might really be an enhancement of Model I with the addition of G x F2). Moreover, on p. 216, three references are made to Row 7 of Table 11.3, the first two suggesting it compares Models III and V, and the third discussing its comparison of Models V and VI, while the table itself shows Row 7 as comparing Models IV and VI, so it is hard to determine which is correct: the text or the table. The compactness of the presentation in this chapter, the admission that the full "simulations are described in detail" in another paper (p. 200), the frequent references to "further simulations sketched" elsewhere (p. 204), and the existence of so many errors make this paper rather difficult to understand.
In contrast, many of the papers are very well presented, with a comprehensive description of all the data. A model of clarity is Wright's chapter, where even the detailed methodology of the formant measurements is reported in full (something that is unfortunately rarely done in research papers of this nature). One might question a couple of things, for example if 12th order linear prediction is sufficient for formant measurements when the sampling rate is 22,050 Hz (p. 80), as Ladefoged (2003:125) suggests an order of between 20 and 24 would be more appropriate; and in Figure 4.2 (p. 82) the "hard" /E/ vowel (in 'den', 'wed' and 'pet') seems to be a bit further from the centre of the vowel space than its "easy" counterpart, while the bar chart in Figure 4.3 on the same page shows the easy version of /E/ as more peripheral. But these are minor quibbles in an otherwise excellent paper.
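For readers wondering why an order of 12 looks low at this sampling rate, one widely quoted rule of thumb sets the linear prediction order at roughly the sampling rate in kHz plus two. This rule is an assumption brought in here purely for illustration: it is not stated in Wright's chapter and is not necessarily the basis of Ladefoged's recommendation. A minimal Python sketch, with a hypothetical function name:

```python
def rule_of_thumb_lpc_order(sampling_rate_hz):
    """Assumed rule of thumb: LPC order ~= sampling rate in kHz + 2."""
    return round(sampling_rate_hz / 1000) + 2

fs = 22050                      # sampling rate reported on p. 80
reported_order = 12             # order reported on p. 80
suggested = rule_of_thumb_lpc_order(fs)

print(suggested)                       # order implied by the assumed rule
print(20 <= suggested <= 24)           # falls in the 20-24 range cited from Ladefoged?
print(reported_order < suggested)      # is the reported order below it?
```

Under this assumption the suggested order sits at the top of the range attributed to Ladefoged (2003:125), which is consistent with the query raised above about an order of 12.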
One might question the interpretation of the data in one or two other places. For example, although Carter's paper is mostly carefully presented and well argued, his conclusion that "[i]nitial laterals are clearer than final laterals" (p. 245) is not supported by the plot for the speaker from Fife, whose initial [l] appears to have a very slightly lower F2 than his final [l] (p. 244). And the claim (p. 245) that initial [r] is darker than initial [l] for this speaker is open to doubt, as the two values in Figure 13.3 are rather close and the error bars overlap, so one might assume that there is no significant difference.
Even though some of the chapters are rather compact, this book does represent an exceptionally valuable compilation of recent laboratory work on various aspects of phonological representation. Furthermore, the four commentaries by Coleman, Harrington, Ladefoged and Docherty offer insightful discussion of the issues and provide critical but thoroughly constructive evaluations of the research. In particular, the discussions by Ladefoged and Docherty represent a real breath of fresh air, raising some important questions about the research, particularly with regard to the number of speakers involved in the data and also the applicability of results obtained under laboratory conditions to the interpretation of real speech. While it is undoubtedly true (as acknowledged by Docherty, p. 343) that the invasive nature of Zawaydeh's work with an endoscope means that she could only realistically study her own articulation, it is still a genuine concern that much of the research depends on so few speakers producing somewhat contrived utterances in artificial conditions. Not only, as observed by Docherty, do the requirements for Tajima and Port's speakers to rehearse the data beforehand lead to doubts about naturalness, but one might also note that Ladd and Scobbie obtained data for Sardinian using prompts in English for one speaker and, for the other two, an invented phonetic script that one of them found hard to use (p. 171), and Keating et al recorded data for Taiwanese using prompts in Mandarin, and furthermore for /n/ this involved repetition of the syllable /na/ (p. 158). Does this really result in genuine speech data, or do we have some kind of artificial laboratory construct?
However, small-scale meticulous investigations using carefully designed, innovative data are at the core of most work in laboratory phonology, and furthermore the focus of this kind of research is generally to devise ingenious fresh ways of investigating speech in order to tease out details of the nature of phonological representation, so it is perhaps not surprising if the data at times are somewhat artificial. Furthermore, if prepared data are to be recorded for languages that are rarely written, such as Sardinian and Taiwanese, then we have to accept that non-ideal prompts must be used.
It is certainly true that studies such as those reported in this book provide fascinating and invaluable evidence about the nature of speech, and in conclusion, the collection of papers in this volume, particularly when accompanied by the four insightful commentaries, represents a very useful overview of some of the laboratory investigations into speech being undertaken around the world.
REFERENCES
Broe, Michael B & Pierrehumbert, Janet B (2000) Papers in Laboratory Phonology V: Acquisition and the Lexicon, Cambridge: Cambridge University Press.
Connell, Bruce & Arvaniti, Amalia (1995) Phonology and Phonological Evidence: Papers in Laboratory Phonology IV, Cambridge: Cambridge University Press.
Gussenhoven, Carlos & Warner, Natasha (2002) Laboratory Phonology 7, Berlin: Mouton de Gruyter.
Ladefoged, Peter (2003) Phonetic Data Analysis: An Introduction to Fieldwork and Instrumental Techniques, Malden, MA: Blackwell.
ABOUT THE REVIEWER:
David Deterding is an Associate Professor at NIE/NTU, Singapore, where he
teaches phonetics, phonology, syntax, and Chinese-English translation.