Olivier Pietquin, Matthieu Geist, Senthilkumar Chandramohan, Hervé Frezza-Buet. Sample-Efficient Batch Reinforcement Learning for Dialogue Management Optimization.
ACM Transactions on Speech and Language Processing, Association for Computing Machinery, 2011, 7 (3), art. 7, pp. 1-21.
⟨10.1145/1966407.1966412⟩.
⟨hal-00617517⟩