Structured Classification for Inverse Reinforcement Learning

Edouard Klein 1 Bilal Piot 1 Matthieu Geist 1 Olivier Pietquin 1
1 IMS - Equipe Information, Multimodalité et Signal
UMI2958 - Georgia Tech - CNRS [Metz], SUPELEC-Campus Metz
Abstract : This paper addresses the Inverse Reinforcement Learning (IRL) problem, a particular case of learning from demonstrations. The IRL framework assumes that an expert, demonstrating a task, acts optimally with respect to an unknown reward function that is to be discovered. Unlike most existing IRL algorithms, the proposed approach requires none of the following: complete trajectories from the expert, a generative model of the environment, knowledge of the transition probabilities, the ability to repeatedly solve the forward Reinforcement Learning (RL) problem, or the expert's policy everywhere in the state space. Using a classification approach in which the structure of the underlying Markov Decision Process (MDP) is implicitly injected, we end up with an efficient subgradient descent-based algorithm. In addition, only a small amount of expert demonstrations (not even in the form of trajectories, but simple transitions) is required.

Keywords : inverse reinforcement learning, structured multi-class classification
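The abstract's core idea — scoring the expert's action above all alternatives via a structured multi-class classifier trained by subgradient descent on individual expert transitions — can be illustrated with a minimal sketch. Everything below (the linear score `theta @ phi(s, a)`, the feature map `phi`, the step size, the margin of 1) is an illustrative assumption for a generic structured hinge loss, not the authors' actual algorithm or feature choice.

```python
import numpy as np

def subgradient_irl(expert_pairs, phi, n_actions, n_features,
                    n_iters=200, lr=0.1):
    """Sketch: learn theta so the expert's action maximizes a linear
    score Q(s, a) = theta . phi(s, a) on a small set of (state, action)
    expert transitions (no full trajectories, no transition model).

    expert_pairs : list of (state, expert_action) transitions
    phi          : hypothetical feature map phi(s, a) -> ndarray
    """
    theta = np.zeros(n_features)
    for _ in range(n_iters):
        grad = np.zeros(n_features)
        for s, a_star in expert_pairs:
            scores = np.array([theta @ phi(s, a) for a in range(n_actions)])
            # Cost-augmented inference: every non-expert action gets a
            # margin bonus of 1 (structured hinge with 0/1 cost).
            margins = scores + 1.0
            margins[a_star] = scores[a_star]
            a_hat = int(np.argmax(margins))
            if a_hat != a_star:
                # Subgradient of the hinge loss for this transition.
                grad += phi(s, a_hat) - phi(s, a_star)
        theta -= lr * grad / len(expert_pairs)
    return theta
```

Note that each update touches only isolated expert transitions, mirroring the paper's point that neither complete trajectories nor a model of the dynamics is needed for this classification step.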
Document type :
Conference papers

https://hal-supelec.archives-ouvertes.fr/hal-00749524
Contributor : Sébastien van Luchene
Submitted on : Wednesday, November 7, 2012 - 4:27:20 PM
Last modification on : Wednesday, July 31, 2019 - 4:18:03 PM

Identifiers

  • HAL Id : hal-00749524, version 1

Citation

Edouard Klein, Bilal Piot, Matthieu Geist, Olivier Pietquin. Structured Classification for Inverse Reinforcement Learning. EWRL 2012, Jun 2012, Edinburgh, United Kingdom. pp.1-14. ⟨hal-00749524⟩
