LINGUIST List 3.51

Mon 20 Jan 1992

Disc: Automatic Transcription

Editor for this issue: <>


  1. Richard Sproat, IPA Computer Transcription
  2. Richard Ogden, IPA and transcribing machines

Message 1: IPA Computer Transcription

Date: Wed, 15 Jan 92 23:37:17 EST
From: Richard Sproat <>
Subject: IPA Computer Transcription

In response to Barbara Ruth Campbell's query concerning the
possibility of computer transcription of speech into IPA.

The problem is probably too hard for current technology. That is, it
is not currently possible to build a speaker-independent system that
will do a sufficiently accurate job of phonetic transcription to be
useful for the purpose of pointing out phonetic errors made by
second-language learners. If the system knows the text to be spoken,
and therefore has an idea of the target sequence of phone(me)s, it is
possible to do a fairly credible job of segmenting the text.
Furthermore if some alternative pronunciations for words (e.g.,
`butter' with or without a flap) are included in the system's lexicon,
then the system will probably do pretty well in many cases at
identifying the pronunciation that was actually said. But I doubt that
any current system could perform well enough to offer a decent
transcription in a case when a speaker does something unexpected, as
could well be the case with second language learners.
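
As a rough illustration of the lexicon idea above, the following Python sketch
picks, for a known target word, whichever listed pronunciation is closest to an
observed phone sequence. Everything in it is an assumption made for the example:
the toy LEXICON, the phone symbols, and the use of plain edit distance over
phone labels. A real system would score acoustic evidence rather than compare
symbols, so this is only a sketch of the selection step, not of a recognizer.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical toy lexicon: each word lists alternative phone sequences.
LEXICON: Dict[str, List[Tuple[str, ...]]] = {
    "butter": [
        ("b", "ʌ", "t", "ɚ"),  # variant with a plain [t]
        ("b", "ʌ", "ɾ", "ɚ"),  # variant with a flap
    ],
}


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two phone sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def closest_variant(word: str, observed: Sequence[str]) -> Tuple[int, Tuple[str, ...]]:
    """Return (distance, variant) for the listed variant nearest to `observed`."""
    scored = [(edit_distance(v, observed), v) for v in LEXICON[word]]
    return min(scored, key=lambda pair: pair[0])


if __name__ == "__main__":
    # A speaker who flaps: the flapped variant matches exactly.
    print(closest_variant("butter", ("b", "ʌ", "ɾ", "ɚ")))
    # An unexpected rendering: the sketch can still only answer with a listed variant.
    print(closest_variant("butter", ("b", "u", "t", "e", "r")))
```

The second call shows the limitation described above: when the speaker does
something the lexicon does not anticipate, the best the sketch can do is report
the nearest listed variant and a large distance, which is not a transcription
of what was actually said.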

However, there has been some work on computer-based spoken language
teaching systems. The work that I know of is by Jared Bernstein at SRI
(, is his email address, I think). I do not think that
it is anywhere near the stage of what you describe, but it is the only
work that I am aware of that is pointing in that direction.

Richard Sproat
Linguistics Research Department
AT&T Bell Laboratories
600 Mountain Avenue, Room 2d-451
Murray Hill, NJ 07974
tel (908) 582-5296
fax (908) 582-7308

Message 2: IPA and transcribing machines

Date: Thu, 16 Jan 92 9:46 GMT
From: Richard Ogden <>
Subject: IPA and transcribing machines

Barbara Ruth Campbell's question about whether a speech recognition system
could make IPA transcriptions that could then be compared reveals a few
widespread misunderstandings about the IPA alphabet and the nature of transcription.

IPA symbols as they appear on the IPA chart are meant to represent 'cardinal'
sounds. You learn them in a way that is similar to being ordained in a
church - you've got to learn it from someone who knows, or else the symbols
have dozens of slightly differing interpretations, which defeats one of
the points of the IPA alphabet. There's nothing to say that these 'cardinal'
sounds will appear in languages, but equally there's nothing to say that
they won't. Usually it's necessary when making an impressionistic transcription
to use diacritics as well as symbols. In a 'broad' transcription the
transcription needs some accompanying notes, e.g. '[r] is used to denote an
alveolar approximant with an accompanying dark resonance'.
The IPA chart does not describe every sound of human speech, although we
can approximate this by combining symbols with diacritics, which we might
(by convention, and in broad transcriptions) leave out. But the diacritics
or accompanying notes can be very important.
The incomplete nature of the IPA has to be accepted, and we have to live
with its limitations.
If you compare two broad transcriptions of two speakers the differences you
find might be rather slight, or misleading. Broad transcriptions typically
leave out the diacritics. The differences are thus in the symbols used
in the transcription (which is *not* the same as saying that the two spoken
texts were the same except for 'segments' a, b and c.) But there might
be other differences which are ignored in broad transcription which are
equally important - portions of speech that are nasalised, or velarised
or have a particular voice quality, or slight differences in vowel quality
(like retracted vowels in certain contexts).
The question now is - what counts as a 'segment'? Broad transcriptions give us
the impression that language is full of segments, but the IPA symbols stand
for a relatively arbitrary set of things; it is kind of implicit in the IPA
that diacritics stand for 'less important' things and can helpfully be left
out. This is not (as I understand it) what was originally intended. How
many segments in an utterance of 'cat'? 3? 4? (aspiration?) more? less?
It surely depends on our understanding of the nature of transcriptions.
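
The point about broad transcriptions hiding differences can be made concrete
with a small Python sketch. It assumes, purely for illustration, that a 'broad'
form can be approximated by deleting Unicode combining marks and modifier
letters from a narrow one. That is a crude stand-in: it would also drop the
length mark, and it says nothing about the accompanying notes a real broad
transcription needs. The two narrow forms of 'cat' are invented examples, not
data from any speaker.

```python
import unicodedata


def to_broad(narrow: str) -> str:
    """Crudely reduce a narrow transcription by dropping diacritics.

    Assumption for this sketch: combining marks (category Mn) and modifier
    letters (category Lm, e.g. superscript h) count as 'diacritics'.
    """
    decomposed = unicodedata.normalize("NFD", narrow)
    return "".join(
        ch for ch in decomposed
        if unicodedata.category(ch) not in {"Mn", "Lm"}
    )


# Hypothetical narrow transcriptions of 'cat' from two speakers.
speaker_a = "k\u02b0\u00e6\u0303t"  # aspirated [k], nasalised vowel
speaker_b = "k\u00e6t\u031a"        # plain [k], unreleased final [t]

if __name__ == "__main__":
    print(speaker_a == speaker_b)                      # narrow forms differ
    print(to_broad(speaker_a) == to_broad(speaker_b))  # broad forms coincide
    print(len(speaker_a), len(to_broad(speaker_a)))    # symbol counts also differ
```

Once the diacritics are removed the two forms are identical, so a comparison of
the broad transcriptions reports no difference even though the narrow ones
record aspiration, nasalisation and an unreleased stop. The change in symbol
count likewise shows that 'how many segments' depends on the kind of
transcription rather than on the speech alone.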

To sum up: transcriptions need to be interpreted (and made) with care; their
interpretation and comparison is not as straightforward as it seems
superficially; the nature of the transcription has to be made explicit before
much sense can be drawn from it; transcription is one form of representing
speech and comparing transcriptions should not be simplistically equated
with comparing tokens of speech.

Richard Ogden
Experimental phonetics laboratory
University of York, BG
(sorry GB!)