A Non-Parametric Approach to Approximate Dynamic Programming

Abstract : Approximate Dynamic Programming (ADP) is a machine learning method that aims to learn an optimal control policy for a dynamic, stochastic system from a logged set of observed interactions between the system and one or several non-optimal controllers. It forms a particular class of Reinforcement Learning (RL) algorithms, RL being a general paradigm for learning such a control policy from interactions. ADP addresses systems whose state space is too large to be enumerated in computer memory. For such systems, approximation schemes are used to generalize value estimates over continuous state spaces. Nevertheless, RL still scales poorly to multidimensional continuous state spaces. In this paper, we propose using the Locally Weighted Projection Regression (LWPR) method to handle this scalability problem. We demonstrate the efficacy of our approach on two standard benchmarks modified to exhibit larger state spaces.
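To make the batch setting described in the abstract concrete, the following is a minimal sketch of fitted Q-iteration on logged transitions with a non-parametric regressor. It is not the authors' implementation: a simple Gaussian-kernel (Nadaraya-Watson) smoother stands in for LWPR, and the one-dimensional toy system, the helper names (`make_logged_data`, `kernel_predict`, `fitted_q_iteration`, `greedy_action`) and all parameter values are assumptions chosen for illustration only.

```python
import numpy as np

def make_logged_data(n, seed=0):
    """Hypothetical toy system: state in [0, 1], action 0 moves left, 1 moves right.
    Transitions are logged under a uniformly random (non-optimal) controller."""
    rng = np.random.default_rng(seed)
    s = rng.uniform(0.0, 1.0, size=(n, 1))
    a = rng.integers(0, 2, size=n)
    step = (2 * a - 1) * 0.1 + rng.normal(0.0, 0.02, size=n)
    s_next = np.clip(s[:, 0] + step, 0.0, 1.0)[:, None]
    r = (s_next[:, 0] >= 0.9).astype(float)  # reward for reaching the right end
    return s, a, r, s_next

def kernel_predict(x_train, y_train, x_query, bandwidth=0.1):
    """Nadaraya-Watson estimate: Gaussian-weighted average of y_train
    (a stand-in for LWPR, used only to keep the sketch short)."""
    d2 = ((x_query[:, None, :] - x_train[None, :, :]) ** 2).sum(axis=2)
    w = np.exp(-0.5 * d2 / bandwidth ** 2)
    return (w @ y_train) / (w.sum(axis=1) + 1e-12)

def fitted_q_iteration(s, a, r, s_next, n_actions, gamma=0.9, n_iter=30, bandwidth=0.1):
    """Repeatedly regress Bellman targets r + gamma * max_b Q(s', b),
    with one non-parametric regressor per action."""
    targets = r.copy()
    for _ in range(n_iter):
        q_next = np.column_stack([
            kernel_predict(s[a == b], targets[a == b], s_next, bandwidth)
            for b in range(n_actions)
        ])
        targets = r + gamma * q_next.max(axis=1)
    return targets

def greedy_action(s, a, targets, query, n_actions, bandwidth=0.1):
    """Greedy policy with respect to the learned Q estimate."""
    q = np.column_stack([
        kernel_predict(s[a == b], targets[a == b], query, bandwidth)
        for b in range(n_actions)
    ])
    return q.argmax(axis=1)

if __name__ == "__main__":
    s, a, r, s_next = make_logged_data(500)
    targets = fitted_q_iteration(s, a, r, s_next, n_actions=2)
    print(greedy_action(s, a, targets, np.array([[0.3], [0.5], [0.7]]), n_actions=2))
```

The kernel smoother computes all pairwise distances between query and training states, so its cost grows with both the number of samples and the state dimension; this is the kind of scalability limit that motivates a method such as LWPR in higher-dimensional state spaces.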
Document type :
Conference papers
Cited literature : 11 references

https://hal-supelec.archives-ouvertes.fr/hal-00652438
Contributor : Sébastien van Luchene
Submitted on : Thursday, December 15, 2011 - 3:51:30 PM
Last modification on : Wednesday, July 31, 2019 - 4:18:03 PM
Long-term archiving on : Monday, December 5, 2016 - 8:45:38 AM

File

ICMLA_2011_HGFAMGOP.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00652438, version 1

Citation

Hadrien Glaude, Fadi Akrimi, Matthieu Geist, Olivier Pietquin. A Non-Parametric Approach to Approximate Dynamic Programming. ICMLA 2011, Dec 2011, Honolulu, Hawaii, United States. pp.1-6. ⟨hal-00652438⟩
