LINGUIST List 8.1499

Sat Oct 18 1997

Review: Taylor, A Phonetic Model of Intonation

Editor for this issue: Andrew Carnie <carnie@linguistlist.org>




What follows is another discussion note contributed to our Book Discussion Forum. We expect these discussions to be informal and interactive; and the author of the book discussed is cordially invited to join in. If you are interested in leading a book discussion, look for books announced on LINGUIST as "available for discussion." (This means that the publisher has sent us a review copy.) Then contact Andrew Carnie at carnie@linguistlist.org.

Directory

  • Karen S. Chung, Re: Book review: _A Phonetic Model..._

    Message 1: Re: Book review: _A Phonetic Model..._

    Date: Sat, 18 Oct 1997 11:53:18 +0800 (CST)
    From: Karen S. Chung <karchung@ccms.ntu.edu.tw>
    Subject: Re: Book review: _A Phonetic Model..._


    Book Review: Taylor, Paul Alexander. 1994. _A Phonetic Model of Intonation in English_. Bloomington: Indiana University Linguistics Club. 171pp. Paper.



    This book is a revised version of the author's Ph.D. dissertation presented to the University of Edinburgh in 1991. Its main thrust concerns getting a computer model to produce accurate intonation contours from F0 (fundamental frequency) input. More specifically, the author wishes to derive 'higher level' phonological information from phonetic (F0) data without the subjective intervention of human 'labelers', and also to be able to reverse the process and obtain phonetic information from phonological information. The author justifies his work by citing the usefulness of having information in different forms.

    The author posits three levels of description: the F0 level, the intermediate (phonetic) level (which 'may exist merely to break the complicated mapping into two easier mappings') and the phonological level, then offers a 'fully specified grammar' ('a device which relates one level to another') to link the three levels via 'mapping' ('a process which uses a grammar in a particular direction', i.e. phonetic to phonological, or vice-versa).

    Before detailing his own experimental work, the author first surveys previous work in phonetic modeling of intonation, including the British, Dutch, and Pierrehumbert schools, and the Fujisaki Model. He compares the models in terms of their success in producing correct intonational contours, and points out the strengths and failings of each. He concludes that none is sufficiently powerful to formally link F0 data and intonational phonology, and this is really the starting point of his own proposed model - which to some extent succeeds where the others fail, yet is ultimately 'not good enough' either at its current stage of development.

    The book details procedures undertaken to correct various inaccuracies in the modeling. For example, unvoiced consonants which caused breaks in the computer output were filled in with a straight line. Because of the distortion this causes, unvoiced consonants were largely omitted from the author's Data Set A but not from Data Set B; the results for Set B were thus less accurate than those for Set A. The book also details the problems of 'insertion' and 'deletion' errors and how the sampling rate was adjusted to keep both kinds of errors to a minimum.
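
    As an illustration only, the straight-line fill described above can be sketched in a few lines of Python. This is not the author's implementation, which the review does not reproduce. The function name, the use of None to mark unvoiced frames, the choice to leave leading and trailing gaps unfilled, and the sample F0 values are all assumptions made for this sketch.

```python
from typing import List, Optional


def fill_unvoiced_gaps(f0: List[Optional[float]]) -> List[Optional[float]]:
    """Linearly interpolate across interior runs of None (unvoiced frames).

    Hypothetical helper: gaps at the very start or end of the contour have
    only one voiced neighbor, so they are left as None.
    """
    filled = list(f0)
    last_voiced = None  # index of the most recent voiced frame seen so far
    for i, value in enumerate(f0):
        if value is None:
            continue
        if last_voiced is not None and i - last_voiced > 1:
            # Draw a straight line from the previous voiced frame to this one.
            start = f0[last_voiced]
            step = (value - start) / (i - last_voiced)
            for j in range(last_voiced + 1, i):
                filled[j] = start + step * (j - last_voiced)
        last_voiced = i
    return filled


# Made-up F0 values in Hz; None stands for frames inside an unvoiced consonant.
contour = [120.0, 124.0, None, None, None, 140.0, 138.0, None]
print(fill_unvoiced_gaps(contour))
# [120.0, 124.0, 128.0, 132.0, 136.0, 140.0, 138.0, None]
```

    The filled stretch is a guess about pitch that was never actually produced, which is one way to picture the distortion mentioned above.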

    The book is divided into five chapters, entitled: Introduction, A Review of Phonetic Modeling, A New Phonetic Model of Intonation, Computer Implementation of the New Model, and Conclusions. The work also includes five appendices. The first contains the texts of the two sets of speech data used in the development of the phonetic model. These were designed to encompass 'all the pitch accents of [British, one must assume] English' and were spoken by the author. The sentences in Data Set A are largely conventional ones, some taken from the literature (example: 'Do you really need to win everything you do?'; different versions with varying stress patterns were used). Most of these are marked for the shape of the nuclear accent (indicated by an asterisk): e.g. fall-rise, low-rise, high-rise, surprise-redundancy. The Data Set B sentences are rather more engaging, since they were culled from a UNIX BBS (example: 'Looks to me as though your mind rot has already set in.'); these were read by a second speaker who was familiar with Net language and who was trained to produce the sentences in a natural manner. Appendix B includes sample graphs of the original F0 contours, the hand-labeled contours, and the automatically labeled transcription. At least in these examples, the phonetic model used seems to have produced results similar to the original F0 contours. Appendix C gives the mathematical derivation of the monomial function, D presents computer implementation details, and E is a list of publications by the author. This is followed by a 5-page bibliography.

    The written style of this work is fluent and clear overall, with only a few typos turning up here and there; the author, however, seems to assume considerable background knowledge on the part of the reader, and dives directly into descriptions of his work without a lot of preliminaries or definitions. For instance, the opening sentence of Chapter 1, 'Introduction', reads: 'Phonetic modeling concerns the relationship between two different representations of intonation: fundamental frequency and phonology.' OK, so we know what phonetic modeling 'concerns'; but what *is* it? Readers who do not belong to the inner circle of this kind of research are left to their own devices - if indeed they are motivated to read the work. Eventually the idea becomes clearer; but it could take considerable effort for a good number of linguists with other specialties.

    Because the book is a technical, blow-by-blow description of the author's experimental procedures and results, it is not exactly 'pleasure reading', and unless one is doing similar research oneself, it is hard to imagine that one would want to spend the time to read the work carefully in its entirety. I was initially interested in the book since English intonation is one of my own teaching and research interests; however, I was somewhat disappointed to find a long discourse on why computer processing of F0 data cannot in fact provide as reliable a description of intonational patterns in English as a human transcriber can.

    Taylor's work certainly makes a notable contribution to the development of computerized speech data processing; but it is clear that the models presented are not yet sophisticated enough for practical use. One can hope that future research will continue to refine Taylor's and other researchers' procedures so as to advance our understanding of intonation, and enhance the language processing capability of computers, which includes applications in speech synthesis and recognition.

    By Karen Steffen Chung, Instructor, Department of Foreign Languages and Literatures, National Taiwan University.