The original version of this article is on my Zhihu profile: https://www.zhihu.com/people/ikerpeng/
References:
- David Silver, Tutorial: Deep Reinforcement Learning, 2016.
- Pieter Abbeel, Policy Optimization, 2017.
- Hado van Hasselt, Deep Reinforcement Learning, 2017.
- R. Sutton and A. Barto, Reinforcement Learning: An Introduction, 2nd ed., 2017.