
Eligibility Traces through Colored Noises

Abstract: The Gaussian Process Temporal Differences (GPTD) framework pioneered the statistical modeling of value function approximation. It was followed by the closely related Kalman Temporal Differences (KTD) approach. Both methods share the same drawback: they provide biased estimates of the value function when the transitions of the system to be controlled are stochastic. A colored noise model has been introduced to cope with this problem in the GPTD framework, and it in fact leads to a Monte-Carlo estimate of the value function. In this paper, we generalize this colored noise model using ideas close to eligibility traces and apply it to the KTD framework. This removes the bias when the so-called eligibility factor is set to one, and reduces it when this factor lies strictly between zero and one. The proposed algorithm is evaluated on the simple Boyan chain in order to study the effect of the eligibility factor. As KTD generalizes GPTD in the sense that it can handle nonlinear parameterizations, we also propose an experiment combining the new algorithm with a neural network.
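As a reading aid, the contrast between the white and colored noise models can be sketched as follows. This is the usual formulation from the GPTD literature, not a reproduction of this paper's equations; the symbols ($s_i$ for states, $r_i$ for rewards, $\gamma$ for the discount factor, $D$ for the random discounted return, $\Delta V$ for its residual around the value function $V$) are notation assumed here for illustration. The generalization involving the eligibility factor is the paper's contribution and is not reproduced.

```latex
% White-noise observation model (implicitly assumes deterministic transitions):
r_i = V(s_i) - \gamma V(s_{i+1}) + n_i, \qquad n_i \text{ white}

% Colored-noise model: write the random return as D(s) = V(s) + \Delta V(s),
% with the residuals \Delta V(s_i) assumed independent and zero-mean.
% The recursion D(s_i) = r_i + \gamma D(s_{i+1}) then gives
r_i = V(s_i) - \gamma V(s_{i+1})
      + \underbrace{\Delta V(s_i) - \gamma\, \Delta V(s_{i+1})}_{n_i}

% Consecutive noise terms n_i and n_{i+1} share \Delta V(s_{i+1}),
% so the observation noise is correlated in time (colored), not white.
```

Because the noise is tied to the full return rather than to a single transition, the resulting estimate behaves like a Monte-Carlo estimate, which is consistent with the abstract's remark above.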
Document type: Conference papers
Contributor: Sébastien van Luchene
Submitted on: Monday, January 10, 2011 - 11:10:33 AM
Last modification on: Monday, December 14, 2020 - 2:10:02 PM




Matthieu Geist, Olivier Pietquin. Eligibility Traces through Colored Noises. ICUMT 2010, Oct 2010, Moscow, Russia. pp.458-465, ⟨10.1109/ICUMT.2010.5676597⟩. ⟨hal-00553910⟩


