In this course, you will learn how to solve problems with large, high-dimensional, and potentially infinite state spaces. You will see that estimating value functions can be cast as a supervised learning problem (function approximation), allowing you to build agents that carefully balance generalization and discrimination in order to maximize reward. We will begin this journey by investigating how policy evaluation (prediction) methods such as Monte Carlo and TD can be extended to the function approximation setting. You will learn about feature construction techniques for RL, and about representation learning with neural networks and backpropagation. We conclude the course with a deep dive into policy gradient methods: a way to learn policies directly, without learning a value function. Along the way you will solve two continuous-state control tasks and investigate the benefits of policy gradient methods in a continuous-action environment.
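To make the idea of prediction with function approximation concrete, here is a minimal sketch of semi-gradient TD(0) with linear features, one of the methods covered in the course (and in Sutton and Barto's book). The 5-state random-walk environment, the one-hot features, and the step-size choice below are illustrative assumptions, not part of the course materials.

```python
import numpy as np

np.random.seed(0)

n_states = 5        # hypothetical 5-state random walk, for illustration only
alpha = 0.05        # step size (assumed)
gamma = 1.0         # undiscounted episodic task

def features(s):
    """One-hot feature vector for state s (a simple linear representation)."""
    x = np.zeros(n_states)
    x[s] = 1.0
    return x

# Weight vector: the approximate value is v_hat(s) = w @ features(s)
w = np.zeros(n_states)

for episode in range(3000):
    s = n_states // 2                            # start in the middle state
    while True:
        s_next = s + np.random.choice([-1, 1])   # equiprobable random-walk policy
        if s_next < 0:                           # terminate left: reward 0
            w += alpha * (0.0 - w @ features(s)) * features(s)
            break
        if s_next >= n_states:                   # terminate right: reward 1
            w += alpha * (1.0 - w @ features(s)) * features(s)
            break
        # Semi-gradient TD(0) update for a non-terminal step (reward 0):
        # w <- w + alpha * [r + gamma * v_hat(s') - v_hat(s)] * grad v_hat(s)
        td_target = gamma * (w @ features(s_next))
        w += alpha * (td_target - w @ features(s)) * features(s)
        s = s_next

# For this walk the true values are roughly 1/6, 2/6, ..., 5/6,
# so the learned weights should increase from left to right.
print(np.round(w, 2))
```

With a richer feature constructor (tile coding, radial basis functions, or a neural network), the same update rule generalizes across states instead of learning each value independently, which is the central theme of the course.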
- 5 stars: 84.26%
- 4 stars: 12.93%
- 3 stars: 2%
- 2 stars: 0.53%
- 1 star: 0.26%
Top reviews from PREDICTION AND CONTROL WITH FUNCTION APPROXIMATION
This course bridged the gap to deep learning, the most exciting direction in RL. I would like a sequel dedicated to this from U Alberta.
Super interesting and challenging, and the videos are very helpful for complementing the Sutton and Barto RL book. Thanks to the Univ. of Alberta team!
This specialization is a gift to humanity. It should have been inscribed into the golden disc of the Voyager and shared with the aliens.
Difficult but excellent and impressive. It is incredible that human beings created such ideas. This course points the way toward a time when such ingenious ideas will be created by self-learning algorithms.