P. Abbeel and A. Y. Ng, Apprenticeship learning via inverse reinforcement learning, Twenty-first International Conference on Machine Learning, ICML '04, 2004.
DOI : 10.1145/1015330.1015430

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=

R. Bellman, A Markovian Decision Process, Indiana University Mathematics Journal, vol.6, issue.4, pp.679-684, 1957.
DOI : 10.1512/iumj.1957.6.56038

W. Eckert, E. Levin, and R. Pieraccini, User modeling for spoken dialogue system evaluation, 1997 IEEE Workshop on Automatic Speech Recognition and Understanding Proceedings, pp.80-87, 1997.
DOI : 10.1109/ASRU.1997.658991

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=

K. Georgila, J. Henderson, and O. Lemon, Learning User Simulations for Information State Update Dialogue Systems, Eurospeech, 2005.

M. G. Lagoudakis and R. Parr, Least-squares policy iteration, Journal of Machine Learning Research, vol.4, pp.1107-1149, 2003.

S. Larsson and D. R. Traum, Information state and dialogue management in the TRINDI dialogue move engine toolkit, Natural Language Engineering, vol.6, issue.3&4, pp.323-340, 2000.
DOI : 10.1017/S1351324900002539

O. Lemon, K. Georgila, J. Henderson, and M. Stuttle, An ISU dialogue system exhibiting reinforcement learning of dialogue policies, Proceedings of the Eleventh Conference of the European Chapter of the Association for Computational Linguistics: Posters & Demonstrations, EACL '06, 2006.
DOI : 10.3115/1608974.1608986

O. Lemon and O. Pietquin, Machine learning for spoken dialogue systems, Proc. of InterSpeech'07, 2007.
URL : https://hal.archives-ouvertes.fr/hal-00216035

A. Y. Ng and S. Russell, Algorithms for inverse reinforcement learning, Proc. of ICML, pp.663-670, 2000.

O. Pietquin and T. Dutoit, A probabilistic framework for dialog simulation and optimal strategy learning, IEEE Transactions on Audio, Speech and Language Processing, vol.14, issue.2, pp.589-599, 2006.
DOI : 10.1109/TSA.2005.855836

URL : https://hal.archives-ouvertes.fr/hal-00207952

O. Pietquin and H. Hastie, A survey on metrics for the evaluation of user simulations, The Knowledge Engineering Review, vol.11, issue.01, 2011.
DOI : 10.1016/j.csl.2009.03.002

URL : https://hal.archives-ouvertes.fr/hal-00771654

J. Schatzmann, M. N. Stuttle, K. Weilhammer, and S. Young, Effects of the user model on simulation-based learning of dialogue strategies, Proc. of ASRU'05, 2005.

J. Schatzmann, B. Thomson, K. Weilhammer, H. Ye, and S. Young, Agenda-based user simulation for bootstrapping a POMDP dialogue system, Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Companion Volume, Short Papers, NAACL '07, 2007.
DOI : 10.3115/1614108.1614146

J. Schatzmann, K. Weilhammer, M. Stuttle, and S. Young, A survey of statistical user simulation techniques for reinforcement-learning of dialogue management strategies, The Knowledge Engineering Review, vol.21, issue.02, pp.97-126, 2006.
DOI : 10.1017/S0269888906000944

R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, IEEE Transactions on Neural Networks, vol.9, issue.5, 1998.
DOI : 10.1109/TNN.1998.712192