LINGUIST List 19.2459

Fri Aug 08 2008

Diss: Comp Ling: Rieser: 'Bootstrapping Reinforcement ...'

Editor for this issue: Evelyn Richter <evelyn@linguistlist.org>

1. Verena Rieser, Bootstrapping Reinforcement Learning-Based Dialogue Strategies from Wizard-of-Oz Data

Message 1: Bootstrapping Reinforcement Learning-Based Dialogue Strategies from Wizard-of-Oz Data
Date: 08-Aug-2008
From: Verena Rieser <vrieser@inf.ed.ac.uk>
Subject: Bootstrapping Reinforcement Learning-Based Dialogue Strategies from Wizard-of-Oz Data

Institution: Saarland University
Program: Department of Computational Linguistics and Phonetics
Dissertation Status: Completed
Degree Date: 2008

Author: Verena Rieser

Dissertation Title: Bootstrapping Reinforcement Learning-Based Dialogue Strategies from Wizard-of-Oz Data

Dissertation URL: http://homepages.inf.ed.ac.uk/vrieser/thesis.html

Linguistic Field(s): Computational Linguistics

Dissertation Director:
Oliver Lemon
Manfred Pinkal

Dissertation Abstract:

In my PhD thesis, I develop a framework to optimise multimodal dialogue
strategies from small amounts of Wizard-of-Oz (WOZ) data.

Designing a spoken dialogue system can be a time-consuming and challenging
process. To facilitate strategy development, recent research investigates
the use of Reinforcement Learning (RL) methods applied to automatic
dialogue strategy optimisation from real data. For new application domains
where a system is designed from scratch, however, there is often no
suitable in-domain data available, leaving the developer with a classic
chicken-and-egg problem.

This thesis proposes to learn dialogue strategies by simulation-based RL,
where the simulated environment is learned from small amounts of
Wizard-of-Oz data. Using WOZ data rather than data from real Human-Computer
Interaction allows us to learn optimal strategies for new application areas
beyond the scope of existing dialogue systems. Optimised learned strategies
are then available from the first moment of online operation, and tedious
handcrafting of dialogue strategies is fully omitted. We call this method
'bootstrapping'.
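
To make the idea of simulation-based RL concrete, here is a minimal Python
sketch. It is not the thesis's actual setup. The toy WOZ log, the three-state
dialogue, the reward values, and all function names are assumptions made up
for illustration. A relative-frequency user model and tabular Q-learning
stand in for whatever simulation and learning methods the thesis uses.

```python
import random
from collections import Counter, defaultdict

# Hypothetical toy "WOZ" log of (system_action, user_response) pairs.
WOZ_LOG = [
    ("ask", "provide"), ("ask", "provide"), ("ask", "provide"), ("ask", "silent"),
    ("confirm", "yes"), ("confirm", "yes"), ("confirm", "no"),
]
ACTIONS = ("ask", "confirm")


def learn_user_simulation(log):
    """Estimate P(user_response | system_action) by relative frequency."""
    counts = defaultdict(Counter)
    for action, response in log:
        counts[action][response] += 1
    return {a: {r: n / sum(c.values()) for r, n in c.items()}
            for a, c in counts.items()}


def step(state, action, sim, rng):
    """Simulated environment. States: 0 = slot empty, 1 = slot filled, 2 = done."""
    responses, probs = zip(*sim[action].items())
    response = rng.choices(responses, weights=probs)[0]
    if action == "ask" and response == "provide":
        return 1, -1.0, False           # slot gets filled, turn costs 1
    if action == "confirm" and state == 1:
        if response == "yes":
            return 2, 10.0, True        # task completed
        return 0, -1.0, False           # user rejects, start over
    return state, -1.0, False           # wasted turn


def q_learning(sim, episodes=5000, alpha=0.05, gamma=0.95, epsilon=0.2, seed=0):
    """Tabular Q-learning run entirely against the learned user simulation."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in (0, 1) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        for _ in range(50):             # cap on dialogue length
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            nxt, reward, done = step(state, action, sim, rng)
            best_next = 0.0 if done else max(q[(nxt, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            if done:
                break
            state = nxt
    return q


def greedy_policy(q):
    """Pick the highest-valued action in each non-terminal state."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in (0, 1)}


if __name__ == "__main__":
    sim = learn_user_simulation(WOZ_LOG)
    print(greedy_policy(q_learning(sim)))
```

The point of the sketch is the division of labour: the small log is used only
to estimate the simulated user, and the policy is then trained on as many
simulated dialogues as needed, so no in-domain human-computer data is
required before the first deployment.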

Our results show that a dialogue policy constructed using this framework
significantly outperforms a non-optimised data-driven policy (constructed
via Supervised Learning) in terms of subjective user ratings and
objective dialogue performance measures. For example, RL leads to an almost
50% increase in perceived Task Ease and an almost 20% increase in Future Use.

The technical contributions of this thesis are new methods and techniques
introduced to learn a simulated learning environment from small amounts of
WOZ data. For example, the thesis introduces a new method for learning and
evaluating user simulations, as well as non-linear reward functions. The
overall contribution is an end-to-end data-driven framework for designing and
evaluating RL-based dialogue strategies, from data collection to user testing.
