Tracking in Reinforcement Learning

Abstract: Reinforcement learning induces non-stationarity at several levels. Adaptation to non-stationary environments is of course a desirable feature of any RL algorithm. Yet, even if the environment of the learning agent can be considered stationary, generalized policy iteration frameworks, because they interleave learning and control, make the evaluated policy, and hence its value function, non-stationary. Tracking the optimal solution instead of trying to converge to it is therefore preferable. In this paper, we propose to handle this tracking issue with a Kalman-based temporal difference framework. Complexity and convergence are analyzed, and empirical investigations of the framework's ability to handle non-stationarity are finally provided.
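As a rough illustration of the idea, the sketch below takes a Kalman-filter view of TD policy evaluation for a linearly parameterized value function V(s) = theta^T phi(s). It is not the paper's exact KTD algorithm (which covers more general parameterizations); the feature map phi, the random-walk process noise Q and the observation noise R are illustrative assumptions. The process noise is what keeps the filter tracking a drifting value function instead of converging and then freezing.

import numpy as np

class LinearKalmanTD:
    """Minimal Kalman-filter sketch of TD policy evaluation with a
    linear value function V(s) = theta^T phi(s).

    The parameters theta are modelled as a random walk; the process-noise
    covariance Q lets the filter keep tracking a non-stationary value
    function rather than converging to a fixed point.
    """

    def __init__(self, n_features, gamma=0.95,
                 process_noise=1e-3, obs_noise=1.0, prior_var=10.0):
        self.gamma = gamma
        self.theta = np.zeros(n_features)            # parameter estimate
        self.P = prior_var * np.eye(n_features)      # parameter covariance
        self.Q = process_noise * np.eye(n_features)  # random-walk (tracking) noise
        self.R = obs_noise                           # observation-noise variance

    def update(self, phi_s, phi_s_next, reward):
        # Prediction step: random-walk dynamics on theta.
        P_pred = self.P + self.Q

        # Observation model: the TD relation rewritten as a linear observation,
        # r ~ (phi(s) - gamma * phi(s'))^T theta + noise.
        h = phi_s - self.gamma * phi_s_next
        innovation = reward - h @ self.theta
        s_var = h @ P_pred @ h + self.R              # innovation variance
        k = (P_pred @ h) / s_var                     # Kalman gain

        # Correction step.
        self.theta = self.theta + k * innovation
        self.P = P_pred - np.outer(k, h) @ P_pred
        return innovation

    def value(self, phi_s):
        return float(self.theta @ phi_s)

# Toy usage (hypothetical setup): track the value of a single state whose
# reward drifts over time; with gamma = 0 the value is the mean reward.
if __name__ == "__main__":
    ktd = LinearKalmanTD(n_features=1, gamma=0.0)
    phi = np.array([1.0])
    for t in range(200):
        reward = 1.0 if t < 100 else -1.0            # non-stationary reward
        ktd.update(phi, phi, reward)
    print(ktd.value(phi))                            # close to -1 after the switch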
Document type: Conference papers

https://hal-supelec.archives-ouvertes.fr/hal-00439316

Citation

Matthieu Geist, Olivier Pietquin, Gabriel Fricout. Tracking in Reinforcement Learning. 16th International Conference on Neural Information Processing - ICONIP 2009, Dec 2009, Bangkok, Thailand. pp.502-511, ⟨10.1007/978-3-642-10677-4_57⟩. ⟨hal-00439316⟩
