Apprenticeship learning via inverse reinforcement learning, Twenty-first International Conference on Machine Learning, ICML '04, 2004.
DOI : 10.1145/1015330.1015430
Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path, Machine Learning, 2008.
URL : https://hal.archives-ouvertes.fr/hal-00830201
On the generation of Markov decision processes, Journal of the Operational Research Society, 1995.
Theory of reproducing kernels, Transactions of the American Mathematical Society, vol.68, issue.3, 1950.
DOI : 10.1090/S0002-9947-1950-0051437-7
Dynamic programming and optimal control, Athena Scientific, vol.1, 1995.
Linear least-squares algorithms for temporal difference learning, Machine Learning, 1996.
Classification and regression trees, 1993.
Generalized gradients and applications, Transactions of the American Mathematical Society, 1975.
DOI : 10.1090/s0002-9947-1975-0367131-6
Error propagation for approximate policy and value iteration, Proc. of NIPS, 2010.
URL : https://hal.archives-ouvertes.fr/hal-00830154
Generalized boosting algorithms for convex optimization, Proc. of ICML, 2011.
Active imitation learning via reduction to iid active learning, Proc. of UAI, 2012.
Learning from limited demonstrations, Proc. of NIPS, 2013.
Inverse reinforcement learning through structured classification, Proc. of NIPS, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00778624
Least-squares policy iteration, Journal of Machine Learning Research, 2003.
Modelling transition dynamics in MDPs with RKHS embeddings, Proc. of ICML, 2012.
Performance Bounds in $L_p$-norm for Approximate Value Iteration, SIAM Journal on Control and Optimization, vol.46, issue.2, 2007.
DOI : 10.1137/040614384
Learning from Demonstrations: Is It Worth Estimating a Reward Function?, Proc. of ECML, 2013.
DOI : 10.1007/978-3-642-40988-2_2
URL : https://hal.archives-ouvertes.fr/hal-00916938
Markov decision processes: Discrete stochastic dynamic programming, 1994.
DOI : 10.1002/9780470316887
Imitation learning for locomotion and manipulation, 2007 7th IEEE-RAS International Conference on Humanoid Robots, 2007.
DOI : 10.1109/ICHR.2007.4813899
Maximum margin planning, Proceedings of the 23rd International Conference on Machine Learning, ICML '06, 2006.
DOI : 10.1145/1143844.1143936
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.142.4056
A reduction of imitation learning and structured prediction to no-regret online learning, Proc. of AISTATS, 2011.
Minimization methods for non-differentiable functions, 1985.
DOI : 10.1007/978-3-642-82118-9
Hilbert space embeddings and metrics on probability measures, The Journal of Machine Learning Research, 2010.
Apprenticeship learning using linear programming, Proceedings of the 25th International Conference on Machine Learning, ICML '08, 2008.
DOI : 10.1145/1390156.1390286
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.149.4348
Rates of convergence for empirical processes of stationary mixing sequences, The Annals of Probability, 1994.