Neuro-Dynamic Programming, Athena Scientific, 1996. ,
Optimality of Reinforcement Learning Algorithms with Linear Function Approximation, Conference on Neural Information Processing Systems (NIPS 15), 2002. ,
Linear Least-Squares Algorithms for Temporal Difference Learning, Machine Learning, pp.33-57, 1996. ,
Residual Algorithms: Reinforcement Learning with Function Approximation, International Conference on Machine Learning (ICML 95), pp.30-37, 1995. ,
Technical Update: Least-Squares Temporal Difference Learning, Machine Learning, pp.233-246, 1999. ,
A New Approach to Linear Filtering and Prediction Problems, Journal of Basic Engineering, vol.82, issue.1, pp.35-45, 1960. ,
DOI : 10.1115/1.3662552
Algorithms and Representations for Reinforcement Learning, 2005. ,
A Generalized Kalman Filter for Fixed Point Approximation and Efficient Temporal-Difference Learning, Discrete Event Dynamic Systems, vol.22, issue.1,2,3, pp.207-239, 2006. ,
DOI : 10.1007/s10626-006-8134-8
Tracking value function dynamics to improve reinforcement learning with piecewise linear function approximation, Proceedings of the 24th international conference on Machine learning, ICML '07, 2007. ,
DOI : 10.1145/1273496.1273591
Unscented Filtering and Nonlinear Estimation, Proceedings of the IEEE, pp.401-422, 2004. ,
DOI : 10.1109/JPROC.2003.823141
Bayesian Reward Filtering, Proceedings of the European Workshop on Reinforcement Learning ser. Lecture Notes in Artificial Intelligence, pp.96-109, 2008. ,
DOI : 10.1007/978-3-540-89722-4_8
URL : https://hal.archives-ouvertes.fr/hal-00351282
Optimal State Estimation: Kalman, H Infinity, and Nonlinear Approaches, 2006. ,
DOI : 10.1002/0470045345
Neural Networks for Pattern Recognition, 1995. ,
A Sparse Nonlinear Bayesian Online Kernel Regression, 2008 The Second International Conference on Advanced Engineering Computing and Applications in Sciences, pp.199-204, 2008. ,
DOI : 10.1109/ADVCOMP.2008.7
URL : https://hal.archives-ouvertes.fr/hal-00327081
Sigma-Point Kalman Filters for Probabilistic Inference in Dynamic State-Space Models, 2004. ,
Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path, Machine Learning, pp.89-129, 2008. ,
URL : https://hal.archives-ouvertes.fr/hal-00830201