A survey on metrics for the evaluation of user simulations

Olivier Pietquin; Helen Hastie

doi:10.1017/S0269888912000343

Journal article, Knowledge Engineering Review. Year: 2013


Olivier Pietquin

Role: Author
PersonId : 4024
IdHAL : olivier-pietquin
ORCID : 0000-0002-5386-465X
IdRef : 142821861

IMS : Information, Multimodalité & Signal

Helen Hastie

Role: Author

School of Mathematical and Computer Sciences

Abstract

User simulation is an important research area in the field of spoken dialogue systems (SDSs) because collecting and annotating real human-machine interactions is often expensive and time-consuming. However, such data are generally required for designing, training and assessing dialogue systems. User simulations are especially needed when machine learning methods such as reinforcement learning are used to optimize dialogue management strategies, because the amount of data needed for training exceeds what existing corpora provide. The quality of the user simulation is therefore crucial: it strongly influences both the SDS performance analysis and the learnt strategy. Assessing the quality of simulated dialogues and user simulation methods remains an open issue and, although assessment metrics are required, no metric has been commonly adopted. In this paper, we survey the user simulation metrics found in the literature, propose some extensions and discuss these metrics against a list of desired features.

Contributor: Sébastien Van Luchene

https://centralesupelec.hal.science/hal-00771654

Submitted on: Wednesday, January 9, 2013, 10:28:50

Last modified on: Thursday, March 7, 2024, 12:32:05

Dates and versions

hal-00771654 , version 1 (09-01-2013)

Identifiers

HAL Id : hal-00771654 , version 1
DOI : 10.1017/S0269888912000343

Cite

Olivier Pietquin, Helen Hastie. A survey on metrics for the evaluation of user simulations. Knowledge Engineering Review, 2013, 28 (1), pp.59-73. ⟨10.1017/S0269888912000343⟩. ⟨hal-00771654⟩


Collections

SUPELEC SUP_IMS CENTRALESUPELEC

