Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path, Machine Learning, pp.89-129, 2008.
URL : https://hal.archives-ouvertes.fr/hal-00830201
Residual Algorithms: Reinforcement Learning with Function Approximation, Proceedings of the International Conference on Machine Learning (ICML 95), pp.30-37, 1995.
DOI : 10.1016/B978-1-55860-377-6.50013-X
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.114.5034
Neuro-Dynamic Programming, 1996.
DOI : 10.1007/0-306-48332-7_333
Incremental Natural Actor-Critic Algorithms, Proceedings of NIPS 21, 2008.
DOI : 10.1016/j.automatica.2009.07.008
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.151.2177
Technical Update: Least-Squares Temporal Difference Learning, Machine Learning, pp.233-246, 1999.
Linear Least-Squares Algorithms for Temporal Difference Learning, Machine Learning, pp.33-57, 1996.
Algorithms and Representations for Reinforcement Learning, 2005.
Reinforcement Learning with Gaussian Processes, Proceedings of the 22nd International Conference on Machine Learning, ICML '05, 2005.
DOI : 10.1145/1102351.1102377
Différences Temporelles de Kalman, 2009.
DOI : 10.3166/ria.24.423-443
Kalman Temporal Differences: The deterministic case, 2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, 2009.
DOI : 10.1109/ADPRL.2009.4927543
URL : https://hal.archives-ouvertes.fr/hal-00380870
Unscented Filtering and Nonlinear Estimation, Proceedings of the IEEE, pp.401-422, 2004.
DOI : 10.1109/JPROC.2003.823141
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.136.6539
A New Approach to Linear Filtering and Prediction Problems, Journal of Basic Engineering, vol.82, issue.1, pp.35-45, 1960.
DOI : 10.1115/1.3662552
Eligibility Traces for Off-Policy Policy Evaluation, Proceedings of the Seventeenth International Conference on Machine Learning (ICML 00), pp.759-766, 2000.
Optimal State Estimation: Kalman, H Infinity, and Nonlinear Approaches, 2006.
DOI : 10.1002/0470045345
Reinforcement Learning: An Introduction, IEEE Transactions on Neural Networks, vol.9, issue.5, 1998.
DOI : 10.1109/TNN.1998.712192
Sigma-Point Kalman Filters for Probabilistic Inference in Dynamic State-Space Models, 2004.
Learning from Delayed Rewards, 1989.