LINGUIST List 5.256

Sun 06 Mar 1994

Review: The Oxford Acoustic Phonetic Database on Compact Disk



Directory

  • Ian MacKay, Review of The Oxford Acoustic Phonetic Database on Compact Disk

    Message 1: Review of The Oxford Acoustic Phonetic Database on Compact Disk

    Date: Mon, 21 Feb 94 12:25:02 EST
    From: Ian MacKay <IMACKAY@acadvm1.uottawa.ca>
    Subject: Review of The Oxford Acoustic Phonetic Database on Compact Disk



    The Oxford Acoustic Phonetic Database on Compact Disk. By J. B. Pickering and B. S. Rosner. 1993. Oxford University Press.

    Reviewed by Ian MacKay (imackay@acadvm1.uottawa.ca)

    The title of this work, Acoustic Phonetic Database, might be taken to suggest an attempt to present exemplars of the full range of sounds of human language. However, the authors' goal was quite different. What they have compiled is a database of words containing monophthongs in 7 languages. Thus, while the database contains many consonants, its goal is to present native-speaker exemplars of simple vowels in context. The database will permit acoustic phonetic research on vowels from a standard set of recordings, thereby helping to settle whether different researchers' conclusions result from differences in technique or from differences in the data studied. Issues of phonetic context (both segmental and suprasegmental), register, dialect, talker age, talker sex, and talker size (which correlates with vocal tract length) create such a tapestry of variables that, particularly in the medium of print, they cannot be satisfactorily dealt with. With a standard reference set such as this, there now exists a standard against which comparisons can be made, to say nothing of the availability of a vast database for direct acoustic descriptive work. The authors suggest that the databases will also be useful to engineers working on automatic speech recognition and to psychologists, in addition to their obvious utility to linguists.

    The publication consists of a phonetic database on 2 CD ROMs and an accompanying manual. The collection and wide dissemination of a database such as this is made possible by CD ROM, which permits precise acoustic control and safeguards the integrity of the data. Distribution by magnetic tape or vinyl disk, a technique that has been employed in the past for some phonetic and other demonstration material, has all of the fidelity and signal-to-noise ratio problems inherent in analogue materials, as well as problems of wow and the precision of playback speed.

    The 7 languages in fact mean 8 databases, since both an American and a British dialect of English are included. The databases include 10 vowels in American English, 11 vowels in British English, 10 vowels in French, 14 in German, 14 in Hungarian, 7 in Italian, 10 in Japanese, and 5 in Spanish. Some choices seem arbitrary: nasalized monophthongs in French are excluded; [OU] and [Ei] in English are excluded, but [ij] and [uw], which are typically diphthongal as well, are included, presumably because they are closer to having a monophthongal character. The dialects chosen are generally the most prestigious or best-known: RP British, Castilian Spanish, Northern German (Hochdeutsch), Northern Italian, Tokyo Japanese, etc. The choice of languages was surely in part a matter of practicality, but the authors have tried to include, among IE languages, representatives of both Germanic and Romance, and two non-IE languages as well.

    The authors rejected nonsense words in collecting their data. They determined the phonotactically-permissible environments for the target vowels in each language (typically: stressed VC and CV or CVC, and unstressed VC and CV), and then sought words that furnished that context for the target vowels. The informants ("speakers") pronounced these words, and the isolated vowels as well (in some languages, of course, the isolated vowels are also lexical items). Taking into account the variety of environments, the American English inventory includes 694 words; 794 in RP; 566 in French; 740 in Hochdeutsch; 957 in Budapest Hungarian; 442 in Italian; 479 in Japanese; and 382 in Spanish. These figures give some appreciation of the scope of the databases, which are truly enormous.

    Similar care was given to the informant characteristics. Each word list was produced by 8 talkers, 4 female, 4 male. They were roughly matched for stature; exact heights are given. They are also roughly matched for age in order to avoid variability due to historical change in progress.
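
    To give a rough sense of scale, the word counts and talker numbers quoted above can be combined as in the sketch below. It assumes that each of the 8 talkers produced every word in a list exactly once and it ignores the isolated vowels; neither point is stated in this review, so the token total is an estimate only. The dictionary keys and variable names are the sketch's own, not the publication's.

```python
# Word counts per database, as quoted in this review.
WORDS_PER_DATABASE = {
    "American English": 694,
    "British English (RP)": 794,
    "French": 566,
    "German (Hochdeutsch)": 740,
    "Hungarian (Budapest)": 957,
    "Italian": 442,
    "Japanese": 479,
    "Spanish": 382,
}

# Each word list was produced by 8 talkers (4 female, 4 male).
TALKERS_PER_DATABASE = 8

# Total distinct test words across the 8 databases.
total_words = sum(WORDS_PER_DATABASE.values())

# ASSUMPTION: one production per word per talker, isolated vowels excluded.
estimated_tokens = total_words * TALKERS_PER_DATABASE

if __name__ == "__main__":
    for name, count in WORDS_PER_DATABASE.items():
        print(f"{name:22s} {count:4d} words, "
              f"about {count * TALKERS_PER_DATABASE} tokens")
    print(f"Total: {total_words} words, about {estimated_tokens} tokens")
```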

    Details of the recording and digitizing process are provided. The recordings were digitized at 10 kHz with 12-bit resolution.
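
    For readers weighing whether these digitization parameters suit their purposes, the following sketch works out what they imply. It assumes that the 10-kHz figure is the sampling rate and that quantization is uniform, and it uses the textbook full-scale-sine formula for the ideal signal-to-quantization-noise ratio; the function names are illustrative and do not come from the manual.

```python
import math


def nyquist_hz(sample_rate_hz: float) -> float:
    """Highest frequency representable at the given sampling rate."""
    return sample_rate_hz / 2.0


def quantization_levels(bits: int) -> int:
    """Number of distinct amplitude values available at the given bit depth."""
    return 2 ** bits


def ideal_sqnr_db(bits: int) -> float:
    """Ideal signal-to-quantization-noise ratio in dB for a full-scale sine
    wave with uniform quantization (about 6.02 dB per bit plus 1.76 dB)."""
    return 20 * math.log10(2 ** bits) + 10 * math.log10(1.5)


# Figures quoted in the review; reading 10 kHz as the sampling rate is an assumption.
SAMPLE_RATE_HZ = 10_000
BIT_DEPTH = 12

if __name__ == "__main__":
    print(f"Usable bandwidth: up to {nyquist_hz(SAMPLE_RATE_HZ):.0f} Hz")
    print(f"Amplitude levels: {quantization_levels(BIT_DEPTH)}")
    print(f"Ideal SQNR:       {ideal_sqnr_db(BIT_DEPTH):.1f} dB")
```

    Under these assumptions the usable bandwidth is 5 kHz, which covers the lower formants on which vowel analysis usually relies, and the theoretical ceiling on signal-to-quantization-noise ratio is roughly 74 dB. The ratio actually achieved in the recordings will depend on the recording chain and is not derivable from these two figures alone.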

    Most of the 200-page manual is given over to the listings of words. For each language database, the vowels are listed by vowel and context, by alphabetical order, and by the numerical order of the test words.

    The CD ROMs were designed for use in a DOS environment in conjunction with an analysis program such as CSRE (Canadian Speech Research Environment). Use with a Macintosh is less transparent, and involves the use of FileConverter (still, the CD ROMs mount on a Macintosh and show directory contents straightforwardly). The Macintosh-converted files can then be accessed by a waveform editor. The authors suggest the use of Signalize; a description of Signalize was posted on LINGUIST in February 1994.

    This work represents an attempt to create an accessible database collected under closely controlled conditions and usable by those having access to what is now considered quite pedestrian equipment, namely a PC with a CD ROM player. (One improvement would have been the inclusion of software permitting playback without specialized analysis software.) The end product represents the accomplishment of an impressive undertaking, and provides a tool of considerable utility.

    NOTE: The reviewer would like to apologize to editors and subscribers of LINGUIST, as well as to the publisher and authors, for the delay in posting this review. Obviously, the advantage of an electronic forum such as LINGUIST is the timely posting of material.