Book Review: Taylor, Paul Alexander. 1994. _A Phonetic Model of
Intonation in English_. Bloomington: Indiana University Linguistics Club.
This book is a revised version of the author's Ph.D.
dissertation presented to the University of Edinburgh in 1991. Its main
thrust concerns getting a computer model to produce accurate intonation
contours from F0 (fundamental frequency) input. More specifically, the
author wishes to derive 'higher level' phonological information from
phonetic (F0) data without the subjective intervention of human
'labelers', and also to be able to reverse the process and obtain phonetic
from phonological information. The author justifies his work by citing the
usefulness of having information in different forms.
The author posits three levels of description: the F0 level, the
intermediate (phonetic) level (which 'may exist merely to break the
complicated mapping into two easier mappings') and the phonological level,
then offers a 'fully specified grammar' ('a device which relates one level
to another') to link the three levels via 'mapping' ('a process which uses
a grammar in a particular direction', i.e. phonetic to phonological, or
vice versa).
Before detailing his own experimental work, the author first
surveys previous work in phonetic modeling of intonation, including the
British, Dutch, and Pierrehumbert schools, and the Fujisaki Model. He
compares the models in terms of their success in producing correct
intonational contours, and points out the strengths and failings of each.
He concludes that none is sufficiently powerful to link F0 data and
intonational phonology formally, and this is the starting point of his
own proposed model, which to some extent succeeds where the others fail;
yet it is ultimately 'not good enough' either, in its current stage of
development.
The book details procedures undertaken to correct various
inaccuracies in the modeling. For example, the breaks that unvoiced
consonants caused in the computer output were filled in with a straight
line. Because of the distortion this causes, unvoiced consonants were
largely omitted from the author's Data Set A but not from Data Set B; the
results for Set B were therefore less accurate than for Set A. The book
also details
the problems of 'insertion' and 'deletion' errors and how the sampling
rate was adjusted to keep both kinds of errors at a minimum.
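To make the straight-line filling concrete for readers outside this line of research, here is a minimal sketch in Python. It is not Taylor's implementation; the function name `fill_unvoiced`, the use of `None` to mark unvoiced frames, and the sample values are all assumptions made for illustration.

```python
from typing import List, Optional


def fill_unvoiced(f0: List[Optional[float]]) -> List[Optional[float]]:
    """Fill interior unvoiced gaps (None) with a straight line.

    Hypothetical helper: each run of None between two voiced frames is
    replaced by values on the line joining those two frames. Gaps at the
    very start or end have only one neighbour and are left as None.
    """
    filled = list(f0)
    voiced = [i for i, value in enumerate(f0) if value is not None]
    for left, right in zip(voiced, voiced[1:]):
        gap = right - left
        if gap > 1:
            # Constant F0 change per frame across the gap.
            step = (f0[right] - f0[left]) / gap
            for k in range(1, gap):
                filled[left + k] = f0[left] + step * k
    return filled


# Illustrative F0 values in Hz; None marks frames inside an unvoiced consonant.
contour = [120.0, 124.0, None, None, None, 140.0, 138.0]
print(fill_unvoiced(contour))
```

The filled stretch carries no measured pitch information, which is consistent with the distortion the author reports for data containing many unvoiced consonants.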
The book is divided into five chapters, entitled: Introduction, A
Review of Phonetic Modeling, A New Phonetic Model of Intonation, Computer
Implementation of the New Model, and Conclusions. The work also includes
five appendices. The first contains the texts of the two sets of speech data
used in the development of the phonetic model. These were designed to
encompass 'all the pitch accents of [British, one must assume] English'
and were spoken by the author. The sentences in Data Set A are largely
conventional ones, some taken from the literature (example: 'Do you really
need to win everything you do?'; different versions with varying stress
patterns were used). Most of these are marked as to the shape of the
nuclear accent (indicated by an asterisk): e.g. fall-rise, low-rise,
high-rise, surprise-redundancy. The Data Set B sentences are rather more
engaging, since they were culled from a UNIX BBS (example: 'Looks to me as
though your mind rot has already set in.'); these were read by a second
speaker who was familiar with Net language and who was trained to produce
the sentences in a natural manner. Appendix B includes sample graphs of
the original F0 contours, the hand-labeled contours, and the automatically
labeled transcription. At least in these examples, the phonetic model used
seems to have produced results similar to the original F0 contours.
Appendix C gives the mathematical derivation of the monomial function, D
presents computer implementation details, and E is a list of publications
by the author. This is followed by a 5-page bibliography.
The written style of this work is fluent and clear overall, with
only a few typos turning up here and there; the author, however, seems to
assume considerable background knowledge on the part of the reader, and
dives directly into descriptions of his work without a lot of
preliminaries or definitions. For instance, the opening sentence of
Chapter 1, "Introduction", reads: 'Phonetic modeling concerns the
relationship between two different representations of intonation:
fundamental frequency and phonology.' OK, so we know what phonetic
modeling 'concerns'; but what *is* it? Readers who do not belong to the
inner circle of this kind of research are left to their own devices - if
indeed they are motivated to read the work. Eventually the idea becomes
clearer, but getting there could take considerable effort for many
linguists with other specialties.
Because the book is a technical, blow-by-blow description of the
author's experimental procedures and results, it is not exactly 'pleasure
reading', and unless one is doing similar research oneself, it is hard to
imagine that one would want to spend the time to read the work carefully in
its entirety. I was initially interested in the book since English
intonation is one of my own teaching and research interests; however, I
was somewhat disappointed to find a long discourse on why computer
processing of F0 data cannot in fact provide as reliable a description of
intonational patterns in English as a human transcriber.
Taylor's work certainly makes a notable contribution to the
development of computerized speech data processing; but it is clear that
the models presented are not yet sophisticated enough for practical use.
One can hope that future research will continue to refine Taylor's and
other researchers' procedures so as to advance our understanding of
intonation and enhance the language processing capability of computers,
with applications in speech synthesis and recognition.
By Karen Steffen Chung, Instructor, Department of Foreign
Languages and Literatures, National Taiwan University.