D. P. Bertsekas, Dynamic Programming and Optimal Control, Athena Scientific, vol.3, 1995.

T. Kovacs and R. Egginton, On the analysis and design of software for reinforcement learning, with a survey of existing systems, Machine Learning, vol.84, issue.1-2, pp.7-49, 2011.
DOI: 10.1007/s10994-011-5237-8