Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path, Machine Learning, pp.89-129, 2008.
URL : https://hal.archives-ouvertes.fr/hal-00830201
Residual Algorithms: Reinforcement Learning with Function Approximation, Proceedings of the International Conference on Machine Learning (ICML 95), pp.30-37, 1995.
DOI : 10.1016/B978-1-55860-377-6.50013-X
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.114.5034
Neuro-Dynamic Programming, 1996.
DOI : 10.1007/0-306-48332-7_333
Incremental Natural Actor-Critic Algorithms, Proceedings of the Twenty-First Annual Conference on Advances in Neural Information Processing Systems (NIPS), 2008.
DOI : 10.1016/j.automatica.2009.07.008
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.151.2177
Neural Networks for Pattern Recognition, 1995.
Technical Update: Least-Squares Temporal Difference Learning, Machine Learning, pp.233-246, 1999.
Linear Least-Squares Algorithms for Temporal Difference Learning, Machine Learning, pp.33-57, 1996.
A Generalized Kalman Filter for Fixed Point Approximation and Efficient Temporal-Difference Learning, Discrete Event Dynamic Systems, pp.207-239, 2006.
Bayesian Q-learning, AAAI/IAAI, pp.761-768, 1998.
Algorithms and Representations for Reinforcement Learning, 2005.
Reinforcement learning with Gaussian processes, Proceedings of the 22nd international conference on Machine learning, ICML '05, 2005.
DOI : 10.1145/1102351.1102377
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.75.5939
Bayesian Reward Filtering, Proceedings of the European Workshop on Reinforcement Learning, pp.96-109, 2008.
DOI : 10.1007/978-3-540-89722-4_8
URL : https://hal.archives-ouvertes.fr/hal-00351282
Kalman Temporal Differences: Uncertainty and Value Function Approximation, NIPS Workshop on Model Uncertainty and Risk in Reinforcement Learning, 2008.
URL : https://hal.archives-ouvertes.fr/hal-00351298
Différences temporelles de Kalman : le cas stochastique, Actes des Journées Francophones de Planification, Décision et Apprentissage pour la conduite de systèmes, 2009.
Kalman Temporal Differences: The deterministic case, 2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, 2009.
DOI : 10.1109/ADPRL.2009.4927543
URL : https://hal.archives-ouvertes.fr/hal-00380870
Consistent Normalized Least Mean Square Filtering with Noisy Data Matrix, IEEE Transactions on Signal Processing, vol.53, issue.6, pp.2112-2123, 2005.
Unscented Filtering and Nonlinear Estimation, Proceedings of the IEEE, pp.401-422, 2004.
DOI : 10.1109/JPROC.2003.823141
A New Approach to Linear Filtering and Prediction Problems, Journal of Basic Engineering, vol.82, issue.1, pp.35-45, 1960.
DOI : 10.1115/1.3662552
Least-Squares Policy Iteration, Journal of Machine Learning Research, vol.4, pp.1107-1149, 2003.
Tracking value function dynamics to improve reinforcement learning with piecewise linear function approximation, Proceedings of the 24th international conference on Machine learning, ICML '07, 2007.
DOI : 10.1145/1273496.1273591
Optimality of Reinforcement Learning Algorithms with Linear Function Approximation, Conference on Neural Information Processing Systems (NIPS 15), 2002.
Processus décisionnels de Markov en intelligence artificielle, 2008.
Optimal State Estimation: Kalman, H Infinity, and Nonlinear Approaches, 2006.
DOI : 10.1002/0470045345
PAC model-free reinforcement learning, Proceedings of the 23rd international conference on Machine learning, ICML '06, pp.881-888, 2006.
DOI : 10.1145/1143844.1143955
Reinforcement Learning: An Introduction, IEEE Transactions on Neural Networks, vol.9, issue.5, 1998.
DOI : 10.1109/TNN.1998.712192
Instrumental variable methods for system identification, Circuits, Systems, and Signal Processing, vol.57, pp.1-9, 2002.
DOI : 10.1007/BFb0009019
Sigma-Point Kalman Filters for Probabilistic Inference in Dynamic State-Space Models, Proceedings of the Workshop on Advances in Machine Learning, 2003.