
Optimizing Spoken Dialogue Management with Fitted Value Iteration

Abstract : In recent years, machine learning approaches have been proposed for optimizing dialogue management in spoken dialogue systems. It is customary to cast the dialogue management problem as a Markov Decision Process (MDP) and to find the optimal policy using Reinforcement Learning (RL) algorithms. However, the dialogue state space is large, and standard RL algorithms fail to handle it. In this paper, we explore the use of a generalization framework for dialogue management: a particular fitted value iteration algorithm, namely fitted-Q iteration. We show that fitted-Q, when applied to dialogue management problems with continuous state spaces, generalizes well and makes efficient use of samples to learn an approximation of the optimal state-action value function. Our experimental results show that fitted-Q performs significantly better than the hand-coded policy and relatively better than the policy learned using least-squares policy iteration (LSPI), another generalization algorithm.
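
As a rough illustration of the fitted-Q iteration idea named in the abstract, the sketch below repeatedly regresses Bellman targets onto state-action pairs from a fixed batch of transitions. Everything specific in it is an assumption made for illustration and is not taken from the paper: the toy dialogue model (a single confidence score in [0, 1], an "ask" action and a "close" action), the reward values, the discount factor, and the regressor, which here is a simple per-bin average rather than whatever function approximator the authors used.

```python
from collections import defaultdict

GAMMA = 0.95      # discount factor (assumed value)
N_BINS = 10       # resolution of the averaging regressor (assumed value)
ACTIONS = (0, 1)  # hypothetical actions: 0 = ask again, 1 = close the dialogue


def bin_of(state):
    """Map a continuous state in [0, 1] to a discrete bin index."""
    return min(int(state * N_BINS), N_BINS - 1)


def make_batch():
    """Fixed batch of (s, a, r, s_next, terminal) from a toy dialogue model."""
    batch = []
    for i in range(21):
        s = i / 20
        # Asking costs one turn and raises the confidence score.
        batch.append((s, 0, -1.0, min(1.0, s + 0.3), False))
        # Closing ends the dialogue; it pays off only when confidence is high.
        batch.append((s, 1, 10.0 if s >= 0.6 else -10.0, s, True))
    return batch


def fit(inputs, targets):
    """Regression step: average the targets in each (bin, action) cell."""
    sums, counts = defaultdict(float), defaultdict(int)
    for (s, a), y in zip(inputs, targets):
        sums[(bin_of(s), a)] += y
        counts[(bin_of(s), a)] += 1
    return {key: sums[key] / counts[key] for key in sums}


def q_value(q, state, action):
    """Evaluate the fitted Q-function; cells with no data default to 0."""
    return q.get((bin_of(state), action), 0.0)


def fitted_q_iteration(batch, n_iterations=50):
    """Alternate between building Bellman targets and refitting the regressor."""
    q = {}  # Q_0 is identically zero
    inputs = [(s, a) for s, a, _, _, _ in batch]
    for _ in range(n_iterations):
        targets = [
            r if terminal
            else r + GAMMA * max(q_value(q, s_next, b) for b in ACTIONS)
            for _, _, r, s_next, terminal in batch
        ]
        q = fit(inputs, targets)
    return q


def greedy_action(q, state):
    """Pick the action with the highest fitted Q-value in this state."""
    return max(ACTIONS, key=lambda a: q_value(q, state, a))


q = fitted_q_iteration(make_batch())
print(greedy_action(q, 0.05), greedy_action(q, 0.95))
```

In this toy setting the greedy policy keeps asking while the confidence score is low and closes the dialogue once it is high. The batch is collected once and reused at every iteration, which is the sample-efficiency property the abstract refers to; swapping `fit` for another regressor changes the function approximator without touching the rest of the loop.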
Document type : Conference papers
Contributor : Sébastien van Luchene
Submitted on : Thursday, January 6, 2011 - 4:40:47 PM
Last modification on : Monday, December 14, 2020 - 2:10:02 PM


  • HAL Id : hal-00553184, version 1



Senthilkumar Chandramohan, Matthieu Geist, Olivier Pietquin. Optimizing Spoken Dialogue Management with Fitted Value Iteration. Interspeech 2010, Sep 2010, Makuhari, Japan. pp.86-89. ⟨hal-00553184⟩


