Uncertainty management for on-line optimisation of a POMDP-based large-scale spoken dialogue system

Abstract: The optimisation of dialogue policies using reinforcement learning (RL) is now an accepted part of the state of the art in spoken dialogue systems (SDS). Yet the commonly used training algorithms for SDS still require a large number of dialogues, and hence most systems rely on artificial data generated by a user simulator. Optimisation is therefore performed off-line before releasing the system to real users. Gaussian Processes (GPs) for RL have recently been applied to dialogue systems. One advantage of GPs is that they compute an explicit measure of uncertainty in the value function estimates computed during learning. In this paper, a class of novel learning strategies is described which use uncertainty to control exploration on-line. Comparisons between several exploration schemes show that significant improvements to learning speed can be obtained and that rapid and safe on-line optimisation is possible, even on a complex task.
Document type: Conference paper

https://hal-supelec.archives-ouvertes.fr/hal-00652194
Contributor: Sébastien van Luchene
Submitted on: Thursday, December 15, 2011 - 9:52:53 AM
Last modification on: Wednesday, July 31, 2019 - 4:18:03 PM
Long-term archiving on: Friday, November 16, 2012 - 3:36:21 PM

File

IS_2011_LDMGSCMGOPSY.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-00652194, version 1

Citation

Lucie Daubigney, Milica Gašić, Senthilkumar Chandramohan, Matthieu Geist, Olivier Pietquin, et al. Uncertainty management for on-line optimisation of a POMDP-based large-scale spoken dialogue system. Interspeech 2011, Aug 2011, Florence, Italy. pp. 1301-1304. ⟨hal-00652194⟩
