Inverse Reinforcement Learning through Structured Classification

Abstract: This paper addresses the inverse reinforcement learning (IRL) problem, that is, inferring a reward for which a demonstrated expert behavior is optimal. We introduce a new algorithm, SCIRL, whose principle is to use the so-called feature expectation of the expert as the parameterization of the score function of a multi-class classifier. This approach produces a reward function for which the expert policy is provably near-optimal. Contrary to most existing IRL algorithms, SCIRL does not require solving the direct RL problem. Moreover, with an appropriate heuristic, it can succeed with only trajectories sampled according to the expert behavior. This is illustrated on a car driving simulator.
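
The principle stated in the abstract can be sketched in a few lines. Below is a minimal, hypothetical Python illustration (all function names are illustrative, not the authors' implementation): the expert feature expectation mu(s, a) = E[sum_t gamma^t phi(s_t) | s_0 = s, a_0 = a, expert policy thereafter] is estimated by Monte Carlo from expert trajectories, and a multi-class perceptron whose score function is theta . mu(s, a) is trained to predict the expert's actions; the learned theta then defines the recovered reward r(s) = theta . phi(s).

    import numpy as np

    def estimate_mu(trajectories, phi, gamma=0.99):
        # Monte-Carlo estimate of the expert feature expectation
        # mu(s, a) = E[sum_t gamma^t phi(s_t) | s_0=s, a_0=a, expert after],
        # computed with a backward pass over each expert trajectory.
        # States are assumed hashable (e.g. discretized).
        sums, counts = {}, {}
        for traj in trajectories:              # traj = [(state, action), ...]
            g = np.zeros_like(phi(traj[0][0]))
            for s, a in reversed(traj):
                g = phi(s) + gamma * g         # discounted feature return from (s, a)
                sums[(s, a)] = sums.get((s, a), 0.0) + g
                counts[(s, a)] = counts.get((s, a), 0) + 1
        return {sa: sums[sa] / counts[sa] for sa in sums}

    def scirl_perceptron(expert_samples, mu, n_actions, d, n_epochs=50):
        # Multi-class perceptron whose score for (s, a) is theta . mu(s, a).
        # Unvisited (s, a) pairs default to a zero vector here -- a crude
        # stand-in for the heuristic the abstract alludes to.
        theta, zero = np.zeros(d), np.zeros(d)
        for _ in range(n_epochs):
            for s, a_star in expert_samples:   # a_star: the expert's action in s
                a_hat = max(range(n_actions),
                            key=lambda a: theta @ mu.get((s, a), zero))
                if a_hat != a_star:            # misclassified: move theta toward expert
                    theta += mu.get((s, a_star), zero) - mu.get((s, a_hat), zero)
        return theta                           # recovered reward: r(s) = theta @ phi(s)

The perceptron and the zero-vector default for unvisited pairs are simplifications chosen for brevity; the paper's analysis allows other classifier and estimator choices, and note that no direct RL problem is solved anywhere in this loop.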
Document type: Conference papers

Cited literature: 18 references

https://hal-supelec.archives-ouvertes.fr/hal-00778624
Contributor: Sébastien van Luchene
Submitted on: Monday, January 21, 2013 - 11:26:06 AM
Last modification on: Wednesday, July 31, 2019 - 4:18:03 PM
Long-term archiving on: Monday, April 22, 2013 - 3:52:32 AM

File: NIPS2012_0491.pdf (produced by the author(s))

Identifiers

  • HAL Id: hal-00778624, version 1

Citation

Edouard Klein, Matthieu Geist, Bilal Piot, Olivier Pietquin. Inverse Reinforcement Learning through Structured Classification. NIPS 2012, Dec 2012, Lake Tahoe, Nevada, United States. pp. 1-9. ⟨hal-00778624⟩
