Summary Details


Query:   Sum: Reactions to synthetic speech
Author:  Bente Henrikka Moxness
Linguistic Field(s):   Phonetics

Summary:   A number of weeks ago I asked very informally for people's reactions
to synthetic speech (also prerecorded speech) and for studies on
emotional reactions to synthetic speech. I wish to thank those who
responded:

Osamu Fujimura
Margaret Jackman
Randall A. Major
Corey Miller
Johanna Rubba
Stephen P. Spackman

I had hoped for more responses, but I have started to collect
information from friends and colleagues as well. I've realized that I
need a more structured way of gathering info - with the possibility
that this (up until now) rather informal approach to the matter may
suddenly turn into a more formal study. The respondents react both
positively and negatively to synthetic speech; one may be irritated at
the "bluntness" of the machine, the lack of flexibility in the
programs, etc. but still find the synthetic vocal information handy.
From Per Egil Heggtveit at Telenor, Norway, I have received a list of
references on synthetic speech, but none of the studies cover
emotional reactions.

Osamu Fujimura wrote:

>I suggest that you ask the question to Marian Macchi.

I did. She responded as follows:

>Two of the US telephone companies have
>introduced a service called "Reverse Directory Assistance", which
>is available to telephone customers. This is a telephone service whereby
>a customer calls a special number, enters a telephone number using
>the touchtone pad, and hears the name and address of the person to
>whom that telephone number is listed. A speech synthesizer (Orator,
>a text-to-speech synthesizer that we have developed here at Bellcore)
>is used to speak the name and address.
>Before the introduction of this automated service, one of the
>telephone companies offered the service with real human operators.
>Today the complaint rate from customers is no higher than it was
>when the service was offered with real operators.
>
>This is not to say that use of synthetic speech is always acceptable.
>In fact, many applications for synthetic speech are not adopted
>because the speech sounds too robotic.

Margaret Jackman wrote:

>My experience with synthetic voices is with our telephone information
>system. It asks what is the name and address of the person for whom
>we want the phone number. I am always annoyed since I know I will
>usually have to repeat it to a real person later.
>
>I am also annoyed with voice mail systems that go on forever - giving
>me 10 different options, instead of the voice operator who puts me
>through to the person I want.
>
>I suppose the problem isn't the synthetic language - it is generally
>very clear and concise. The problem is that when I get one it
>generally wastes my time, and for that reason, I have a negative
>reaction to them.

Randall A. Major wrote:

>I'm not sure if they've worked on reactions or not, but you should try
>contacting Barbara Grosz at
>grosz@eecs.harvard.edu
>They've done a lot of work on synthetic speech and she may be able to
>help you. Good luck!

I contacted Barbara Grosz, who wrote:

>sorry, but I have not done any experiments of this sort, though I have
>done some work on speech synthesis. My colleague, Julia Hirschberg,
>at AT&T research may know of some research in this arena, though
>I don't believe she has done any either.

I haven't contacted Julia Hirschberg yet, but I intend to.

Corey Miller wrote:

>You may want to look at an article on the perception of synthetic
>speech by David Pisoni, in Progress in Speech Synthesis,
>van Santen, Sproat, Olive and Hirschberg, Springer, 1997.

I've tried to get a copy of the article through our university
library, but the book is too recent, and I was told no copies are
available yet.

Johanna Rubba wrote:

>My personal reaction to a synthetic voice on the phone is negative. I
>experience offense (because the company involved does not care enough
>to have a real person staffing the phone line; they'd rather downsize
>and replace people with machines); irritation (because I am not going
>to be able to get any questions answered, and am going to be obliged
>to follow the inflexible program set down by the corporation [and
>these are inevitably not well-designed, they waste the customer's
>time]). I also experience
>irritation because synthetic voices do not sound like real voices,
>meaning I have to put forth extra effort to parse their output, and also
>because I am a perfectionist and don't understand why even relatively
>simple things like normal list intonation (not the weird system used on
>the [non-synthetic, just pre-recorded] directory assistance systems)
>can't be gotten right.
>
>I know enough about computational linguistics to know that achieving
>real-sounding synthetic speech is extremely difficult, esp. if context
>has to be taken into account. Is this an excuse for ugly synthetic
>speech? Only if you think we really need synthetic speech. Do we?
>
>Oh, it's not all negative -- I do experience a low level of curiosity and
>amusement in hearing how much of the sound of real speech the designers
>have managed to capture in the artificial speech, and the particular
>distortions that are found in synthetic speech (my intro ling students
>love it when I mimic synthetic speech for them and point out things like
>stress and intonation. I think some progress has been made in this area,
>but they sure do recognize that flat, syllable-timed, nasal voice!)
>
>I just thought of a good use of synthetic speech that I do like. My word
>processor has an auditory editor that reads my texts back to me. Though
>the speech has some flaws, it's not too terribly bad, and it is a very
>useful function when the eyes are no longer capable of seeing the errors.
>Note that I like this because it's not an interaction; I get to choose
>when I use it, and I don't expect to have a conversation with it.

and finally,
Stephen P. Spackman wrote:

>Myself, I *like* machines. I use bank machines instead of live tellers
>whenever practical. But (and this doesn't all bear directly on your
>query, but maybe I'm talking to someone who wants to listen...!):
>
>(1) No deception. A machine should announce itself as such - ideally by
>going "boing" or something before it starts to talk. It's extremely
>annoying to find yourself trying to talk *with* a machine thinking it is
>human. When you find out otherwise you feel both stupid and annoyed at
>your wasted effort. Even answering machine messages have this problem.
>
>(2) Machines are not excused from clearing their throats and saying
>hello. Again, "boing" will do and may even be preferable to "ahem" as
>just mentioned. But I once nearly died of fright when a computer behind
>me in a darkened room in a deserted building at 3am suddenly said "your
>printer is out of paper." in an extremely calm, pleasant voice but with
>inadequate warning.
>
>(3) Machines are not excused from boundary markers. One of the things I
>*loathe* about automated directory assistance systems and talking clocks
>is that they use the SAME recorded digits in all positions. This makes
>it extremely hard to copy numbers down and know that you have them
>right, as well as being simply annoying. Even just having separate
>final/nonfinal digits would be an improvement. This is actually *less*
>of a problem with synthesised speech, partly because synthesis systems
>are more likely to do contour, and partly because they sound uniformly
>bad rather than atrociously edited!
>
>(4) Machines are not excused from rephrasing. A computer reading phone
>numbers should say, "seven two _six_, one _three_ zero _three_", but if
>asked to repeat itself should use "seven twenty-six, thirteen oh three".
>
>(5) Speech *recognition* systems, at present, fail *consistently* for
>some speakers. The statistics on successfully completed transactions may
>be looking great, while some customers are effectively faced with
>termination of service!
>
>What's specifically wrong with synthetic speech? Total absence of
>pragmatic markers at every level, poor pitch contours, lack of
>interactive adaptation with interlocutor at every level, poorly modelled
>interaction between adjacent segments (which decreases noise immunity
>rather than increasing it, no matter what one's engineering intuitions
>might say :-).
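
The rephrasing rule in point (4) above can be sketched in a few lines of
Python. This is only a rough illustration, not any deployed system: the
function names, the grouping of the number into "726" and "1303", and the
choice to read a doubled zero as "oh oh" are all assumptions made for the
example, and the stress on group-final digits (the underscores in the
quote) is not modelled.

```python
# Hypothetical sketch: read a phone number digit by digit the first time,
# and rephrase it in pairs ("twenty-six", "oh three") when asked to repeat.

ONES = ["zero", "one", "two", "three", "four",
        "five", "six", "seven", "eight", "nine"]
TEENS = ["ten", "eleven", "twelve", "thirteen", "fourteen",
         "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty",
        "fifty", "sixty", "seventy", "eighty", "ninety"]


def say_pair(pair):
    """Speak a two-digit string as a number, e.g. '26' -> 'twenty-six'."""
    tens, ones = int(pair[0]), int(pair[1])
    if tens == 0:
        # Leading zero is read as "oh" (assumed convention for "00" too).
        return "oh " + ("oh" if ones == 0 else ONES[ones])
    if tens == 1:
        return TEENS[ones]
    if ones == 0:
        return TENS[tens]
    return TENS[tens] + "-" + ONES[ones]


def say_digits(group):
    """First reading: every digit on its own."""
    return " ".join(ONES[int(d)] for d in group)


def say_grouped(group):
    """Repeat reading: a lone leading digit if the length is odd, then pairs."""
    parts = []
    if len(group) % 2 == 1:
        parts.append(ONES[int(group[0])])
        group = group[1:]
    for i in range(0, len(group), 2):
        parts.append(say_pair(group[i:i + 2]))
    return " ".join(parts)


def read_number(groups, repeat=False):
    """Join the spoken groups with a comma, which stands in for a pause."""
    say = say_grouped if repeat else say_digits
    return ", ".join(say(g) for g in groups)


print(read_number(["726", "1303"]))
print(read_number(["726", "1303"], repeat=True))
```

With the number from the quote, the first call gives the digit-by-digit
reading and the second gives the rephrased reading.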

Thanks again to all respondents!

Bente

#########################################################################
Bente Henrikka Moxness
Research Assistant
Dept. of Linguistics
NTNU (Norwegian University of Science and Technology)
7055 Dragvoll
Norway
Tel: +47 73 59 15 16
Fax: +47 73 59 61 19
e-mail: benmox@alfa.itea.ntnu.no
#########################################################################

LL Issue: 8.824
Date Posted: 03-Jun-1997