P. Abbeel and A. Ng, Apprenticeship learning via inverse reinforcement learning, Twenty-first international conference on Machine learning, ICML '04, 2004.
DOI : 10.1145/1015330.1015430

S. Bradtke and A. Barto, Linear least-squares algorithms for temporal difference learning, Machine Learning, pp.33-57, 1996.

J. Kolter, P. Abbeel, and A. Ng, Hierarchical apprenticeship learning with application to quadruped locomotion, Neural information processing systems, 2008.

M. Lagoudakis and R. Parr, Least-squares policy iteration, The Journal of Machine Learning Research, vol.4, pp.1107-1149, 2003.

A. Lazaric, M. Ghavamzadeh, and R. Munos, Finite-sample analysis of LSTD, Proceedings of the 27th International Conference on Machine Learning, 2010.
URL : https://hal.archives-ouvertes.fr/inria-00482189

A. Nedić and D. Bertsekas, Least squares policy evaluation algorithms with linear function approximation, Discrete Event Dynamic Systems, pp.79-110, 2003.

G. Neu and C. Szepesvári, Apprenticeship learning using inverse reinforcement learning and gradient methods, Proc. UAI, pp.295-302, 2007.

A. Ng and S. Russell, Algorithms for inverse reinforcement learning, Proceedings of the Seventeenth International Conference on Machine Learning, pp.663-670, 2000.

M. Puterman, Markov decision processes: Discrete stochastic dynamic programming, 1994.
DOI : 10.1002/9780470316887

D. Ramachandran and E. Amir, Bayesian inverse reinforcement learning, Proc. IJCAI, pp.2586-2591, 2007.

N. Ratliff, J. Bagnell, and S. Srinivasa, Imitation learning for locomotion and manipulation, 2007 7th IEEE-RAS International Conference on Humanoid Robots, pp.392-397, 2007.
DOI : 10.1109/ICHR.2007.4813899

N. Ratliff, J. Bagnell, and M. Zinkevich, Maximum margin planning, Proceedings of the 23rd international conference on Machine learning, ICML '06, p.736, 2006.
DOI : 10.1145/1143844.1143936

N. Ratliff, D. Bradley, J. Bagnell, and J. Chestnutt, Boosting structured prediction for imitation learning, Advances in Neural Information Processing Systems, vol.19, p.1153, 2007.

S. Russell, Learning agents for uncertain environments (extended abstract), Proceedings of the eleventh annual conference on Computational learning theory, COLT '98, p.103, 1998.
DOI : 10.1145/279943.279964

R. Sutton and A. Barto, Reinforcement learning, 1998.
DOI : 10.1007/978-1-4615-3618-5

URL : https://hal.archives-ouvertes.fr/hal-00764281

U. Syed, M. Bowling, and R. Schapire, Apprenticeship learning using linear programming, Proceedings of the 25th international conference on Machine learning, ICML '08, pp.1032-1039, 2008.
DOI : 10.1145/1390156.1390286

U. Syed and R. Schapire, A game-theoretic approach to apprenticeship learning, Advances in neural information processing systems, pp.1449-1456, 2008.

B. Ziebart, A. Maas, J. Bagnell, and A. Dey, Maximum entropy inverse reinforcement learning, Proc. AAAI, pp.1433-1438, 2008.