Sample-based Learning Methods
In this course, you will learn about several algorithms that can learn near-optimal policies through trial-and-error interaction with the environment, that is, learning from the agent's own experience. Learning from actual experience is striking because it requires no prior knowledge of the environment's dynamics, yet can still attain optimal behavior. We will cover intuitively simple but powerful Monte Carlo methods, and temporal-difference (TD) learning methods, including Q-learning. We will wrap up this course by investigating how we can get the best of both worlds: algorithms that combine model-based planning (similar to dynamic programming) with temporal-difference updates to radically accelerate learning.
By the end of this course you will be able to:
- Understand Temporal-Difference learning and Monte Carlo as two strategies for estimating value functions from sampled experience
- Understand the importance of exploration when using sampled experience rather than dynamic programming sweeps within a model
- Understand the connections between Monte Carlo, Dynamic Programming, and TD
- Implement and apply the TD algorithm for estimating value functions (a TD(0) sketch follows this list)
- Implement and apply Expected Sarsa and Q-learning, two TD methods for control (sketched after this list)
- Understand the difference between on-policy and off-policy control
- Understand planning with simulated experience (as opposed to classic planning strategies)
- Implement a model-based approach to RL, called Dyna, which uses simulated experience (a Dyna-Q sketch follows this list)
- Conduct an empirical study to see the improvements in sample efficiency when using Dyna
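To make these objectives concrete, here are a few short sketches. First, a minimal tabular TD(0) policy-evaluation loop; the environment interface (reset, step, num_states) and the policy callable are illustrative assumptions, not the course's starter code:

```python
import numpy as np

def td0_prediction(env, policy, num_episodes, alpha=0.1, gamma=1.0):
    """Tabular TD(0): estimate the value function of a fixed policy
    from sampled experience.

    Assumes env.reset() -> state, env.step(action) -> (next_state, reward, done),
    and integer states in range(env.num_states); these names are hypothetical.
    """
    V = np.zeros(env.num_states)
    for _ in range(num_episodes):
        state = env.reset()
        done = False
        while not done:
            next_state, reward, done = env.step(policy(state))
            # TD(0) update: move V(s) toward the one-step bootstrapped target.
            target = reward + gamma * (0.0 if done else V[next_state])
            V[state] += alpha * (target - V[state])
            state = next_state
    return V
```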
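Next, the two control updates named above, sketched under the assumption of a tabular action-value array Q of shape (num_states, num_actions). Expected Sarsa is shown with an epsilon-greedy target policy, one common choice; Q-learning bootstraps from the greedy action instead:

```python
import numpy as np

def q_learning_update(Q, s, a, r, s_next, done, alpha=0.1, gamma=1.0):
    """Off-policy TD control: bootstrap from the best action in the next state."""
    target = r + gamma * (0.0 if done else np.max(Q[s_next]))
    Q[s, a] += alpha * (target - Q[s, a])

def expected_sarsa_update(Q, s, a, r, s_next, done,
                          epsilon=0.1, alpha=0.1, gamma=1.0):
    """TD control that bootstraps from the expected action value of the next
    state under an epsilon-greedy policy, not a single sampled action."""
    num_actions = Q.shape[1]
    probs = np.full(num_actions, epsilon / num_actions)
    probs[np.argmax(Q[s_next])] += 1.0 - epsilon
    target = r + gamma * (0.0 if done else probs @ Q[s_next])
    Q[s, a] += alpha * (target - Q[s, a])
```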
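Finally, a sketch of a single Dyna-Q step under the usual tabular assumptions (deterministic transitions, a dict model keyed by state-action pairs); the function and variable names are illustrative:

```python
import random
import numpy as np

def dyna_q_step(Q, model, s, a, r, s_next, done,
                planning_steps=10, alpha=0.1, gamma=0.95):
    """One Dyna-Q step: direct RL, model learning, then planning."""
    # (a) Direct RL: an ordinary Q-learning update from the real transition.
    target = r + gamma * (0.0 if done else np.max(Q[s_next]))
    Q[s, a] += alpha * (target - Q[s, a])
    # (b) Model learning: remember the observed (deterministic) transition.
    model[(s, a)] = (r, s_next, done)
    # (c) Planning: replay transitions from the learned model; these extra
    #     updates are the source of Dyna's sample-efficiency gains.
    for _ in range(planning_steps):
        ps, pa = random.choice(list(model.keys()))
        pr, ps2, pdone = model[(ps, pa)]
        ptarget = pr + gamma * (0.0 if pdone else np.max(Q[ps2]))
        Q[ps, pa] += alpha * (ptarget - Q[ps, pa])
```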
Syllabus - What you will learn from this course
Week 1
Welcome to the Course!
Monte Carlo Methods for Prediction & Control
Week 2
Temporal Difference Learning Methods for Prediction
Week 3
Temporal Difference Learning Methods for Control
Week 4
Planning, Learning & Acting
FAQ
When will I have access to the lectures and assignments?
Access to lectures and assignments depends on your type of enrollment. If you take a course in audit mode, you will be able to see most course materials for free. To access graded assignments and to earn a Certificate, you will need to purchase the Certificate experience, during or after your audit.
What will I get if I subscribe to this Specialization?
When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile. If you only want to read and view the course content, you can audit the course for free.
What is the refund policy?
If you subscribed, you get a 7-day free trial during which you can cancel at no penalty. After that, we don’t give refunds, but you can cancel your subscription at any time. See our full refund policy.
Is financial aid available?
Yes. In select learning programs, you can apply for financial aid or a scholarship if you can’t afford the enrollment fee. If financial aid or a scholarship is available for your learning program selection, you’ll find a link to apply on the description page.
Reviews
Everything is great overall, but it would be better if Dyna-Q and Dyna-Q+ were explained in more detail in the lectures instead of the assignment.
Really great resource to follow along with the RL book. Important suggestion: do not skip the reading assignments; they are really helpful, and following the videos and assignments becomes easy.
It's an important course for understanding the workings of reinforcement learning, although some important and complex topics mentioned in the textbook are not explored in this course.
Excellent course that naturally extends the first course of the specialization. The programming examples are very good, and I loved how RL gets closer and closer to how a living being thinks.