E. Muhrer and M. Vollrath, Results from accident analysis, ISI-PADAS project deliverable, 2009.

M. Vollrath, Ableitung von Anforderungen an ein Fahrerassistenzsystem aus Sicht der Verkehrssicherheit [Deriving requirements for a driver assistance system from a traffic safety perspective], Bundesanstalt für Straßenwesen, 2006.

S. Briest and M. Vollrath, In welchen Situationen machen Fahrer welche Fehler? Ableitung von Anforderungen an Fahrerassistenzsysteme durch In-Depth-Unfallanalysen [In which situations do drivers make which errors? Deriving requirements for driver assistance systems through in-depth accident analyses], Integrierte Sicherheit und Fahrerassistenzsysteme, pp.449-463, 2006.

R. J. Kiefer, J. Salinger, and J. J. Ference, Status of NHTSA's rear-end crash prevention research program, National Highway Traffic Safety Administration, 2005.

J. D. Lee, D. V. Mcgehee, T. L. Brown, and M. L. Reyes, Collision Warning Timing, Driver Distraction, and Driver Response to Imminent Rear-End Collisions in a High-Fidelity Driving Simulator, Human Factors: The Journal of the Human Factors and Ergonomics Society, vol.44, issue.2, pp.314-334, 2002.
DOI : 10.1518/0018720024497844

R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction (Adaptive Computation and Machine Learning), MIT Press, 1998.

R. Bellman, A Markovian Decision Process, Indiana University Mathematics Journal, vol.6, issue.4, pp.679-684, 1957.
DOI : 10.1512/iumj.1957.6.56038

K. Kuhnert and M. Krödel, Autonomous Vehicle Steering Based on Evaluative Feedback by Reinforcement Learning, Lecture Notes in Computer Science, vol.3587, pp.405-414, 2005.
DOI : 10.1007/11510888_40

T. Martinez-Marin, A reinforcement learning algorithm for optimal motion of car-like vehicles, Proceedings of the 7th International IEEE Conference on Intelligent Transportation Systems, pp.47-51, 2004.
DOI : 10.1109/ITSC.2004.1398870

S. Oh, J. Lee, and D. Choi, A new reinforcement learning vehicle control architecture for vision-based road following, IEEE Transactions on Vehicular Technology, vol.49, issue.3, pp.997-1005, 2000.

M. L. Puterman, Markov Decision Processes: Discrete Stochastic Dynamic Programming, John Wiley & Sons, 1994.
DOI : 10.1002/9780470316887

G. Tesauro, Temporal difference learning and TD-Gammon, Communications of the ACM, vol.38, issue.3, pp.58-68, 1995.
DOI : 10.1145/203330.203343

S. J. Bradtke and A. G. Barto, Linear Least-Squares algorithms for temporal difference learning, Machine Learning, vol.22, pp.33-57, 1996.
DOI : 10.1007/978-0-585-33656-5_4

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.143.857

M. G. Lagoudakis and R. Parr, Least-squares policy iteration, Journal of Machine Learning Research, vol.4, pp.1107-1149, 2003.

G. Gordon, Stable Function Approximation in Dynamic Programming, Proceedings of the International Conference on Machine Learning (ICML 95), 1995.
DOI : 10.1016/B978-1-55860-377-6.50040-2

C. Watkins, Learning from delayed rewards, Ph.D. thesis, University of Cambridge, 1989.

M. Geist and O. Pietquin, Kalman Temporal Differences, Journal of Artificial Intelligence Research, vol.39, pp.483-532, 2010.
URL : https://hal.archives-ouvertes.fr/hal-00351297

Y. Engel, S. Mannor, and R. Meir, Reinforcement learning with Gaussian processes, Proceedings of the 22nd International Conference on Machine Learning (ICML '05), 2005.
DOI : 10.1145/1102351.1102377

A. L. Strehl and M. L. Littman, An analysis of model-based Interval Estimation for Markov Decision Processes, Journal of Computer and System Sciences, vol.74, issue.8, 2008.
DOI : 10.1016/j.jcss.2007.08.009

C. Atkeson and J. Santamaria, A comparison of direct and model-based reinforcement learning, Proceedings of the IEEE International Conference on Robotics and Automation, pp.3557-3564, 1997.
DOI : 10.1109/ROBOT.1997.606886