A cascaded supervised learning approach to inverse reinforcement learning

Abstract: This paper considers the Inverse Reinforcement Learning (IRL) problem, that is, inferring a reward function for which a demonstrated expert policy is optimal. We propose to break the IRL problem down into two generic Supervised Learning steps: this is the Cascaded Supervised IRL (CSI) approach. A classification step that defines a score function is followed by a regression step that provides a reward function. A theoretical analysis shows that the demonstrated expert policy is near-optimal for the computed reward function. The CSI approach has two important advantages: it does not need to repeatedly solve a Markov Decision Process (MDP), and it can leverage existing techniques for classification and regression. It is furthermore empirically shown to compare favorably with state-of-the-art approaches when using only transitions sampled according to the expert policy, up to the use of some heuristics. This is exemplified on two classical benchmarks (the mountain car problem and a highway driving simulator).
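The two-step cascade described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper's code: the toy five-state chain, the tabular softmax classifier, the least-squares regressor, and the helper names `fit_score_function` and `regress_reward` are all hypothetical. The sketch also assumes that the regression targets are obtained by inverting the Bellman equation on the score function, that is, target = q(s, a) - gamma * max over a' of q(s', a'); the abstract itself does not spell this out.

```python
import numpy as np

def fit_score_function(states, actions, n_states, n_actions, lr=0.5, n_iters=500):
    """Step 1 (classification): tabular softmax classifier on expert pairs.

    Returns a score table q[s, a]; the classifier's decision rule is
    argmax_a q[s, a]. Illustrative stand-in for any score-based classifier.
    """
    q = np.zeros((n_states, n_actions))
    idx = np.arange(len(actions))
    for _ in range(n_iters):
        logits = q[states]
        logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        probs[idx, actions] -= 1.0            # gradient of the log-loss w.r.t. logits
        grad = np.zeros_like(q)
        np.add.at(grad, states, probs)        # accumulate per-state gradients
        q -= lr * grad / len(actions)
    return q

def regress_reward(states, actions, next_states, q, gamma):
    """Step 2 (regression): fit a reward table to Bellman-inverted targets.

    Assumed target for each sampled transition (s, a, s'):
        q[s, a] - gamma * max_a' q[s', a']
    """
    n_states, n_actions = q.shape
    targets = q[states, actions] - gamma * q[next_states].max(axis=1)
    # One-hot features over (state, action) pairs, fitted by least squares.
    features = np.zeros((len(states), n_states * n_actions))
    features[np.arange(len(states)), states * n_actions + actions] = 1.0
    weights, *_ = np.linalg.lstsq(features, targets, rcond=None)
    return weights.reshape(n_states, n_actions), targets

# Hypothetical expert data: a 5-state chain where the expert always moves right.
n_states, n_actions, gamma = 5, 2, 0.9
states = np.array([0, 1, 2, 3, 4] * 4)
actions = np.ones_like(states)                       # action 1 = "right"
next_states = np.minimum(states + 1, n_states - 1)

q = fit_score_function(states, actions, n_states, n_actions)
reward, targets = regress_reward(states, actions, next_states, q, gamma)
```

Neither step solves an MDP: both are ordinary supervised fits on the sampled transitions. Because only expert actions appear in the data, the state-action pairs that were never observed get no regression target here; this sketch does not attempt to reproduce the heuristics that the abstract mentions for that setting.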
Document type:
Conference paper
Edouard Klein, Bilal Piot, Matthieu Geist, Olivier Pietquin. Joint European Conference on Machine Learning and Knowledge Discovery in Databases (ECML/PKDD 2013), Sep 2013, Prague, Czech Republic. Springer, Lecture Notes in Computer Science, 8188, pp.1-16, 2013, Machine Learning and Knowledge Discovery in Databases. 〈10.1007/978-3-642-40988-2_1〉

https://hal-supelec.archives-ouvertes.fr/hal-00869804
Contributor: Sébastien Van Luchene
Submitted on: Monday, November 6, 2017 - 17:44:27
Last modified on: Monday, April 23, 2018 - 14:35:06

File

csi_irl.pdf
Files produced by the author(s)

Citation

Edouard Klein, Bilal Piot, Matthieu Geist, Olivier Pietquin. A cascaded supervised learning approach to inverse reinforcement learning. Joint European Conference on Machine Learning and Knowledge Discovery in Databases (ECML/PKDD 2013), Sep 2013, Prague, Czech Republic. Springer, Lecture Notes in Computer Science, 8188, pp.1-16, 2013, Machine Learning and Knowledge Discovery in Databases. 〈10.1007/978-3-642-40988-2_1〉. 〈hal-00869804〉
