Conference paper, Year: 2010

Optimizing Spoken Dialogue Management with Fitted Value Iteration

Senthilkumar Chandramohan
  • Role: Author
  • PersonId : 888330
Matthieu Geist
Olivier Pietquin

Abstract

In recent years, machine learning approaches have been proposed for optimizing dialogue management in spoken dialogue systems. It is customary to cast the dialogue management problem as a Markov Decision Process (MDP) and to find the optimal policy using Reinforcement Learning (RL) algorithms. However, the dialogue state space is large, and standard RL algorithms fail to handle it. In this paper, we explore the use of a generalization framework for dialogue management, namely fitted-Q iteration, a particular fitted value iteration algorithm. We show that fitted-Q, when applied to continuous state space dialogue management problems, generalizes well and makes efficient use of samples to learn the approximate optimal state-action value function. Our experimental results show that fitted-Q performs significantly better than the hand-coded policy and relatively better than the policy learned using least-squares policy iteration (LSPI), another generalization algorithm.
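
To make the idea of fitted-Q iteration concrete, here is a minimal sketch in Python. It is not the authors' implementation and uses no dialogue data: the 1-D toy task, the one-hot state-bin features, and every name (`features`, `make_toy_transitions`, `fitted_q`, `q_value`, `greedy_action`) are assumptions chosen for illustration. The sketch repeatedly regresses a linear Q-function onto bootstrapped targets computed from a fixed batch of sampled transitions, which is the general pattern of fitted value iteration.

```python
import numpy as np

N_BINS = 10     # hypothetical: number of bins over the state range [0, 1]
N_ACTIONS = 2   # hypothetical: action 0 moves left, action 1 moves right


def features(state, action):
    """One-hot features over (action, state bin) pairs (an illustrative choice)."""
    phi = np.zeros(N_BINS * N_ACTIONS)
    bin_index = min(int(state * N_BINS), N_BINS - 1)
    phi[action * N_BINS + bin_index] = 1.0
    return phi


def make_toy_transitions(n_samples=500, seed=0):
    """Random transitions from a hypothetical 1-D task (not dialogue data).

    Each step moves the state by +0.1 or -0.1 inside [0, 1]. Reaching 1.0
    ends the episode with reward 1; every other step gives reward 0.
    """
    rng = np.random.default_rng(seed)
    transitions = []
    for _ in range(n_samples):
        s = rng.uniform(0.0, 1.0)
        a = int(rng.integers(N_ACTIONS))
        s_next = min(max(s + (0.1 if a == 1 else -0.1), 0.0), 1.0)
        done = s_next >= 1.0
        transitions.append((s, a, 1.0 if done else 0.0, s_next, done))
    return transitions


def fitted_q(transitions, gamma=0.9, n_iters=50, ridge=1e-6):
    """Fitted-Q iteration with a linear model fitted by ridge regression."""
    phi = np.array([features(t[0], t[1]) for t in transitions])
    rewards = np.array([t[2] for t in transitions])
    not_done = np.array([0.0 if t[4] else 1.0 for t in transitions])
    # Features of every action in every next state: shape (N, N_ACTIONS, dim)
    phi_next = np.array(
        [[features(t[3], b) for b in range(N_ACTIONS)] for t in transitions]
    )
    # The regression inputs never change, so the Gram matrix is built once
    gram = phi.T @ phi + ridge * np.eye(phi.shape[1])
    theta = np.zeros(phi.shape[1])
    for _ in range(n_iters):
        q_next = phi_next @ theta  # shape (N, N_ACTIONS)
        # Bootstrapped regression targets; no bootstrap past terminal states
        targets = rewards + gamma * not_done * q_next.max(axis=1)
        theta = np.linalg.solve(gram, phi.T @ targets)
    return theta


def q_value(theta, state, action):
    """Approximate state-action value under the fitted linear model."""
    return float(features(state, action) @ theta)


def greedy_action(theta, state):
    """Action with the highest approximate Q-value in the given state."""
    return int(np.argmax([q_value(theta, state, a) for a in range(N_ACTIONS)]))
```

Calling `theta = fitted_q(make_toy_transitions())` and then `greedy_action(theta, 0.5)` should select the move toward the goal (action 1) on this toy task. One-hot bins are used here only because they keep the regression step easy to reason about; a richer feature map could be swapped into `features` without changing the iteration itself.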
File not deposited

Dates and versions

hal-00553184, version 1 (06-01-2011)

Identifiers

  • HAL Id: hal-00553184, version 1

Cite

Senthilkumar Chandramohan, Matthieu Geist, Olivier Pietquin. Optimizing Spoken Dialogue Management with Fitted Value Iteration. Interspeech 2010, Sep 2010, Makuhari, Japan. pp. 86-89. ⟨hal-00553184⟩