J. Baxter and P. Bartlett, Direct gradient-based reinforcement learning, 2000 IEEE International Symposium on Circuits and Systems (ISCAS), 2000.
DOI : 10.1109/ISCAS.2000.856049

M. Clerc and J. Kennedy, The particle swarm - explosion, stability, and convergence in a multidimensional complex space, IEEE Transactions on Evolutionary Computation, vol.6, issue.1, pp.58-73, 2002.
DOI : 10.1109/4235.985692

A. Engelbrecht, Fundamentals of Computational Swarm Intelligence, 2005.

A. Engelbrecht, Heterogeneous Particle Swarm Optimization, Lecture Notes in Computer Science, vol.6234, pp.191-202, 2010.
DOI : 10.1007/978-3-642-15461-4_17

M. Geist and O. Pietquin, Parametric value function approximation: A unified view, 2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL), 2011.
DOI : 10.1109/ADPRL.2011.5967355
URL : https://hal.archives-ouvertes.fr/hal-00618112

J. Kennedy and R. Eberhart, Particle swarm optimization, Proceedings of ICNN'95, International Conference on Neural Networks, pp.1942-1948, 1995.
DOI : 10.1109/ICNN.1995.488968

M. Lagoudakis and R. Parr, Least-squares policy iteration, Journal of Machine Learning Research, vol.4, pp.1107-1149, 2003.

S. Mannor, R. Rubinstein, and Y. Gat, The cross entropy method for fast policy search, International Conference on Machine Learning, p.512, 2003.

R. Munos and A. Moore, Variable resolution discretization for high-accuracy solutions of optimal control problems, International Joint Conference on Artificial Intelligence (IJCAI), pp.1348-1355, 1999.

M. W. Spong, The swing up control problem for the Acrobot, IEEE Control Systems Magazine, vol.15, issue.1, pp.49-55, 1995.
DOI : 10.1109/37.341864

R. Sutton and A. Barto, Reinforcement Learning: An Introduction, MIT Press, 1998.
DOI : 10.1109/TNN.1998.712192