W. Eckert, E. Levin, and R. Pieraccini, User modeling for spoken dialogue system evaluation, 1997 IEEE Workshop on Automatic Speech Recognition and Understanding Proceedings, pp.80-87, 1997.
DOI : 10.1109/ASRU.1997.658991

J. Schatzmann, K. Weilhammer, M. Stuttle, and S. Young, A survey of statistical user simulation techniques for reinforcement-learning of dialogue management strategies, The Knowledge Engineering Review, vol.21, issue.02, pp.97-126, 2006.
DOI : 10.1017/S0269888906000944

O. Pietquin and T. Dutoit, A probabilistic framework for dialog simulation and optimal strategy learning, IEEE Transactions on Audio, Speech and Language Processing, vol.14, issue.2, pp.589-599, 2006.
DOI : 10.1109/TSA.2005.855836
URL : https://hal.archives-ouvertes.fr/hal-00207952

J. Schatzmann, B. Thomson, K. Weilhammer, H. Ye, and S. Young, Agenda-based user simulation for bootstrapping a POMDP dialogue system, Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Companion Volume, Short Papers, NAACL '07, 2007.
DOI : 10.3115/1614108.1614146

K. Georgila, J. Henderson, and O. Lemon, Learning User Simulations for Information State Update Dialogue Systems, Eurospeech, 2005.

O. Pietquin and R. Beaufort, Comparing ASR Modeling Methods for Spoken Dialogue Simulation and Optimal Strategy Learning, Proc. of Eurospeech'05, pp.861-864, 2005.

O. Pietquin and T. Dutoit, Dynamic Bayesian Networks for NLU Simulation with Applications to Dialog Optimal Strategy Learning, 2006 IEEE International Conference on Acoustics, Speech and Signal Processing Proceedings, pp.49-52, 2006.
DOI : 10.1109/ICASSP.2006.1659954

O. Lemon and O. Pietquin, Machine learning for spoken dialogue systems, Proc. of InterSpeech'07, 2007.
URL : https://hal.archives-ouvertes.fr/hal-00216035

O. Pietquin and H. Hastie, A survey on metrics for the evaluation of user simulations, The Knowledge Engineering Review, vol.11, issue.01, 2011.
DOI : 10.1016/j.csl.2009.03.002
URL : https://hal.archives-ouvertes.fr/hal-00771654

P. Abbeel and A. Y. Ng, Apprenticeship learning via inverse reinforcement learning, Twenty-first International Conference on Machine Learning, ICML '04, 2004.
DOI : 10.1145/1015330.1015430
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.2.92

A. Y. Ng and S. Russell, Algorithms for inverse reinforcement learning, Proc. of ICML, 2000.

R. Bellman, A Markovian Decision Process, Indiana University Mathematics Journal, vol.6, issue.4, pp.679-684, 1957.
DOI : 10.1512/iumj.1957.6.56038

O. Lemon, K. Georgila, J. Henderson, and M. Stuttle, An ISU dialogue system exhibiting reinforcement learning of dialogue policies, Proceedings of the Eleventh Conference of the European Chapter of the Association for Computational Linguistics: Posters & Demonstrations, EACL '06, 2006.
DOI : 10.3115/1608974.1608986

S. Larsson and D. R. Traum, Information state and dialogue management in the TRINDI dialogue move engine toolkit, Natural Language Engineering, vol.6, issue.3&4, pp.323-340, 2000.
DOI : 10.1017/S1351324900002539

M. G. Lagoudakis and R. Parr, Least-squares policy iteration, Journal of Machine Learning Research, vol.4, pp.1107-1149, 2003.

J. Schatzmann, M. N. Stuttle, K. Weilhammer, and S. Young, Effects of the user model on simulation-based learning of dialogue strategies, Proc. of ASRU'05, 2005.

E. Klein, M. Geist, and O. Pietquin, Batch, Off-Policy and Model-Free Apprenticeship Learning, IJCAI-ALIHT Workshop, 2011.
DOI : 10.1007/978-3-642-29946-9_28
URL : https://hal.archives-ouvertes.fr/hal-00660623