LINGUIST List 19.1853

Wed Jun 11 2008

Qs: Hierarchy of Variation in Natural Speech

Editor for this issue: Catherine Adams <catherinlinguistlist.org>

        1.    Cassie Mayo, Hierarchy of Variation in Natural Speech

Message 1: Hierarchy of Variation in Natural Speech
Date: 11-Jun-2008
From: Cassie Mayo <catherinling.ed.ac.uk>
Subject: Hierarchy of Variation in Natural Speech

I'm working on a project looking at the process of subjective evaluation of
speech synthesis (that is, we're not evaluating, but rather determining
what listeners do when they evaluate). We have found (probably
unsurprisingly) that the acoustic information that listeners are influenced
by in judging something like "naturalness" of synthetic speech falls into
a hierarchy -- listeners are more influenced by some sorts of information
than others. In very general terms, the hierarchy seems to be: presence of
artifacts (due to join discontinuities, etc.) has more influence than
segmental quality, which in turn has more influence than intonation
appropriateness.

Intuitively, this looks to me like the opposite of what would be considered
acceptable variation in natural speech: listeners will accept a great deal
of variation in intonation, somewhat less variation in segmental quality,
and much less (no?) variation in terms of presence of artifacts (pops and
clicks, rather than repairs and restarts).

Has anyone come across any references that might support this intuition?

Linguistic Field(s): Computational Linguistics
                            Forensic Linguistics
