Dissertation Information

Title: Bootstrapping Reinforcement Learning-Based Dialogue Strategies from Wizard-of-Oz Data
Author: Verena Rieser
Institution: Saarland University, Department of Computational Linguistics and Phonetics
Completed in: 2008
Linguistic Subfield(s): Computational Linguistics
Director(s): Manfred Pinkal, Oliver Lemon

Abstract: In my PhD thesis, I develop a framework to optimise multimodal dialogue
strategies from small amounts of Wizard-of-Oz (WOZ) data.

Designing a spoken dialogue system can be a time-consuming and challenging
process. To facilitate strategy development, recent research investigates
the use of Reinforcement Learning (RL) methods applied to automatic
dialogue strategy optimisation from real data. For new application domains
where a system is designed from scratch, however, there is often no
suitable in-domain data available, leaving the developer with a classic
chicken-and-egg problem.

This thesis proposes to learn dialogue strategies by simulation-based RL,
where the simulated environment is learned from small amounts of
Wizard-of-Oz data. Using WOZ data rather than data from real Human-Computer
Interaction allows us to learn optimal strategies for new application areas
beyond the scope of existing dialogue systems. Optimised learned strategies
are then available from the first moment of online operation, and tedious
handcrafting of dialogue strategies is fully omitted. We call this method

Our results show that a dialogue policy constructed using this framework
significantly outperforms a non-optimised data-driven policy (constructed
via Supervised Learning) in terms of subjective user ratings and
objective dialogue performance measures. For example, RL leads to an almost
50% increase in perceived Task Ease and an almost 20% increase in Future Use.

The technical contributions of this thesis are new methods and techniques
introduced to learn a simulated learning environment from small amounts of
WOZ data. For example, the thesis introduces a new method to learn and
evaluate user simulations, as well as non-linear reward functions. The
overall contribution is
an end-to-end data-driven framework to design and evaluate RL-based
dialogue strategies, from data collection to user testing.
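
The general idea of simulation-based RL described in the abstract can be illustrated with a toy sketch. This is not the thesis's implementation: the state space (number of filled slots), the two actions, the reward values, and the tiny WOZ_LOG are all hypothetical, and plain tabular Q-learning stands in for whatever RL algorithm the thesis actually uses. The sketch estimates a simulated user from a handful of logged transitions and then learns a dialogue policy against that simulation.

```python
import random
from collections import Counter, defaultdict

# Hypothetical WOZ-style log: (filled_slots, wizard_action, next_filled_slots).
WOZ_LOG = ([(0, "ask", 1)] * 3 + [(0, "ask", 0)]
           + [(1, "ask", 2)] * 3 + [(1, "ask", 1)])
ACTIONS = ["ask", "present"]
GOAL = 2  # assumed number of slots needed before presenting results


def learn_user_model(log):
    """Count observed user responses for each (state, action) pair."""
    counts = defaultdict(Counter)
    for state, action, next_state in log:
        counts[(state, action)][next_state] += 1
    return counts


def simulate_step(model, state, action, rng):
    """Return (next_state, reward, done) from the learned simulation."""
    if action == "present":
        # Assumed reward: presenting ends the dialogue, good only at GOAL.
        return state, (10.0 if state == GOAL else -5.0), True
    outcomes = model.get((state, action))
    if not outcomes:
        # No data for this situation: assume nothing changes.
        return state, -1.0, False
    next_states = list(outcomes)
    weights = [outcomes[s] for s in next_states]
    # Sample the user's response in proportion to the logged frequencies.
    return rng.choices(next_states, weights=weights)[0], -1.0, False


def q_learning(model, episodes=3000, alpha=0.1, gamma=0.95,
               epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = defaultdict(float)
    for _ in range(episodes):
        state = 0
        for _turn in range(10):  # cap dialogue length
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            next_state, reward, done = simulate_step(model, state, action, rng)
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (
                reward + gamma * best_next - q[(state, action)])
            if done:
                break
            state = next_state
    return q


def greedy_policy(q):
    """Pick the highest-valued action in each state."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL + 1)}


policy = greedy_policy(q_learning(learn_user_model(WOZ_LOG)))
print(policy)
```

Because the simulated user is estimated from frequency counts alone, even a small log yields an environment the learner can interact with for thousands of episodes, which is the point the abstract makes about small amounts of WOZ data.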