S. Larsson and D. R. Traum, Information state and dialogue management in the TRINDI dialogue move engine toolkit, Natural Language Engineering, pp.323-340, 2000.
DOI : 10.1017/S1351324900002539

O. Pietquin and T. Dutoit, Aided design of finite-state dialogue management systems, 2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698), pp.545-548, 2003.
DOI : 10.1109/ICME.2003.1221369

R. Freedman, Atlas: A plan manager for mixed-initiative, multimodal dialogue, Proceedings of the AAAI-99 Workshop on Mixed-Initiative Intelligence, pp.1-8, 1999.

R. Bellman, A Markovian Decision Process, Indiana University Mathematics Journal, vol.6, issue.4, pp.679-684, 1957.
DOI : 10.1512/iumj.1957.6.56038

C. Boutilier, T. Dean, and S. Hanks, Decision-theoretic planning: Structural assumptions and computational leverage, Journal of Artificial Intelligence Research, vol.11, pp.1-94, 1999.

E. Levin, R. Pieraccini, and W. Eckert, Learning dialogue strategies within the Markov decision process framework, 1997 IEEE Workshop on Automatic Speech Recognition and Understanding Proceedings, pp.72-79, 1997.
DOI : 10.1109/ASRU.1997.658989

M. Walker, D. Litman, C. Kamm, and A. Abella, Paradise: A framework for evaluating spoken dialogue agents, Proceedings of Annual Meeting of the Association for Computational Linguistics, pp.271-280, 1997.

R. Sutton and A. Barto, Reinforcement Learning: An Introduction, IEEE Transactions on Neural Networks, vol.9, issue.5, 1998.
DOI : 10.1109/TNN.1998.712192

S. Singh, M. Kearns, D. Litman, and M. Walker, Reinforcement learning for spoken dialogue systems, Proceedings of the Conference of the Neural Information Processing Systems Foundation (NIPS), 1999.

E. Levin, R. Pieraccini, and W. Eckert, A stochastic model of human-machine interaction for learning dialog strategies, IEEE Transactions on Speech and Audio Processing, vol.8, issue.1, pp.11-23, 2000.
DOI : 10.1109/89.817450

O. Pietquin and T. Dutoit, A probabilistic framework for dialog simulation and optimal strategy learning, IEEE Transactions on Audio, Speech and Language Processing, vol.14, issue.2, pp.589-599, 2006.
DOI : 10.1109/TSA.2005.855836
URL : https://hal.archives-ouvertes.fr/hal-00207952

S. Young, M. Gasic, S. Keizer, F. Mairesse, J. Schatzmann et al., The Hidden Information State model: A practical framework for POMDP-based spoken dialogue management, Computer Speech & Language, vol.24, issue.2, pp.150-174, 2010.
DOI : 10.1016/j.csl.2009.04.001
URL : https://hal.archives-ouvertes.fr/hal-00598186

N. Roy, J. Pineau, and S. Thrun, Spoken dialogue management using probabilistic reasoning, Proceedings of the 38th Annual Meeting on Association for Computational Linguistics, ACL '00, pp.93-100, 2000.
DOI : 10.3115/1075218.1075231
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.32.8204

V. Heidrich-Meisner and C. Igel, Evolution Strategies for Direct Policy Search, Parallel Problem Solving from Nature – PPSN X, pp.428-437, 2008.
DOI : 10.1007/978-3-540-87700-4_43

S. Mannor, R. Rubinstein, and Y. Gat, The cross entropy method for fast policy search, Proceedings of the International Conference on Machine Learning (ICML), pp.512-519, 2003.

L. Busoniu, D. Ernst, B. De Schutter, and R. Babuska, Cross-Entropy Optimization of Control Policies With Adaptive Basis Functions, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), vol.41, issue.1, pp.196-209, 2011.
DOI : 10.1109/TSMCB.2010.2050586

A. P. Engelbrecht, Fundamentals of computational swarm intelligence, 2005.

J. Kennedy and R. Eberhart, Particle swarm optimization, Proceedings of ICNN'95, International Conference on Neural Networks, pp.1942-1948, 1995.
DOI : 10.1109/ICNN.1995.488968

M. Gašić, F. Jurčíček, S. Keizer, F. Mairesse, B. Thomson et al., Gaussian processes for fast policy optimisation of POMDP-based dialogue managers, Proceedings of the 11th Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGdial), pp.201-204, 2010.

L. Daubigney, M. Gašić, S. Chandramohan, M. Geist, O. Pietquin et al., Uncertainty management for on-line optimisation of a POMDP-based large-scale spoken dialogue system, Proceedings of InterSpeech, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00652194

L. Daubigney, M. Geist, S. Chandramohan, and O. Pietquin, A Comprehensive Reinforcement Learning Framework for Dialogue Management Optimization, IEEE Journal of Selected Topics in Signal Processing, vol.6, issue.8, pp.891-902, 2012.
DOI : 10.1109/JSTSP.2012.2229257

J. Fix and M. Geist, Monte-Carlo Swarm Policy Search, Symposium on Swarm Intelligence and Differential Evolution (SIDE), ser. Lecture Notes in Artificial Intelligence (LNAI), 2012.

M. Clerc, Standard particle swarm optimisation from, 2006.
DOI : 10.4018/978-1-4666-1592-2.ch001

J. Schatzmann, B. Thomson, K. Weilhammer, H. Ye, and S. Young, Agenda-based user simulation for bootstrapping a POMDP dialogue system, Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Companion Volume, Short Papers on XX, NAACL '07, pp.149-152, 2007.
DOI : 10.3115/1614108.1614146

M. Gašić, F. Jurčíček, S. Keizer, F. Mairesse, B. Thomson et al., Gaussian processes for fast policy optimisation of POMDP-based dialogue managers, Proceedings of the Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGdial), Association for Computational Linguistics, pp.201-204, 2010.