Modern Reinforcement-learning using Deep Learning

Rating 1.0 out of 5 (2 ratings in Udemy)

What you'll learn

Be able to start deep reinforcement-learning research
Be able to start a deep reinforcement-learning engineering role
Understand modern, state-of-the-art deep reinforcement-learning methods
Understand core deep reinforcement-learning concepts

Description

Hello, I am Nitsan Soffair, a Deep RL researcher at BGU.

In my Deep reinforcement-learning course you will learn the newest state-of-the-art Deep reinforcement-learning knowledge.

You will do the following

Get state-of-the-art knowledge regarding
1. Model types
2. Algorithms and approaches
3. Function approximation
4. Deep reinforcement-learning
5. Deep Multi-agent Reinforcement-learning
Validate your knowledge by answering short and very short quizzes after each lecture.
Be able to complete the course in ~2 hours.

Syllabus

Model types
1. Markov decision process (MDP)
  A discrete-time stochastic control process.
2. Partially observable Markov decision process (POMDP)
  A generalization of MDP in which the agent cannot directly observe the underlying state.
3. Decentralized Partially observable Markov decision process (Dec-POMDP)
  A generalization of POMDP to consider multiple decentralized agents.
Algorithms and approaches
1. Bellman equations
  A necessary condition for optimality associated with dynamic programming.
2. Model-free
  A model-free algorithm is an algorithm which does not use the transition probabilities and reward function (the "model") of the MDP.
3. Off-policy
  An off-policy algorithm is an algorithm that learns about one policy (the target policy) while acting in the environment with a different policy (the behavior policy).
4. Exploration-exploitation
  A trade-off in Reinforcement-learning between exploring new actions and exploiting the actions already known to give high reward.
5. Value-iteration
  An iterative algorithm that repeatedly applies the Bellman optimality backup.
6. SARSA
  An on-policy algorithm for learning a Markov decision process policy.
7. Q-learning
  A model-free reinforcement learning algorithm to learn the value of an action in a particular state.
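
The ideas in this list can be sketched in a few lines of Python. This is a minimal illustration, not course material: the two-state MDP, its rewards, and every function name below are hypothetical and chosen only for the example. It shows value iteration (which needs the model), and the model-free Q-learning and SARSA updates side by side with an epsilon-greedy action choice.

```python
import random

# Hypothetical 2-state MDP, invented for illustration only.
# P[state][action] is a list of (probability, next_state, reward) triples.
P = {
    0: {"stay": [(1.0, 0, 0.0)], "go": [(1.0, 1, 1.0)]},
    1: {"stay": [(1.0, 1, 2.0)], "go": [(1.0, 0, 0.0)]},
}
GAMMA = 0.9


def value_iteration(P, gamma, tol=1e-8):
    """Apply the Bellman optimality backup until the values stop changing."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            # Best expected one-step return over all actions in state s.
            best = max(
                sum(p * (r + gamma * V[s2]) for p, s2, r in outcomes)
                for outcomes in P[s].values()
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V


def epsilon_greedy(Q, s, epsilon, rng=random):
    """Explore with probability epsilon, otherwise exploit the best known action."""
    if rng.random() < epsilon:
        return rng.choice(list(Q[s]))
    return max(Q[s], key=Q[s].get)


def q_learning_update(Q, s, a, r, s2, alpha, gamma):
    """Off-policy: bootstrap from the greedy action in the next state."""
    target = r + gamma * max(Q[s2].values())
    Q[s][a] += alpha * (target - Q[s][a])


def sarsa_update(Q, s, a, r, s2, a2, alpha, gamma):
    """On-policy: bootstrap from the action actually taken in the next state."""
    target = r + gamma * Q[s2][a2]
    Q[s][a] += alpha * (target - Q[s][a])


V = value_iteration(P, GAMMA)
```

The only difference between the two update functions is the bootstrap term: Q-learning uses the maximum over next actions regardless of what the agent does next, while SARSA uses the next action the agent actually selected.
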
Function approximation
1. Function approximators
  Function approximation asks us to select a function from a well-defined class that closely matches ("approximates") a target function in a task-specific way.
2. Policy-gradient
  Covers value-based, policy-based, and actor-critic methods, the policy gradient, and the softmax policy.
3. REINFORCE
  A policy-gradient algorithm.
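
As a rough sketch of how REINFORCE combines a softmax policy with the policy gradient, the example below trains action preferences on a two-armed bandit. The bandit, its rewards, the step size, and all function names are assumptions made for illustration. Each episode is a single step, so the return G equals the immediate reward.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def grad_log_softmax(prefs, action):
    """Gradient of log pi(action) with respect to each preference."""
    probs = softmax(prefs)
    return [(1.0 if k == action else 0.0) - probs[k] for k in range(len(prefs))]


def reinforce_bandit(rewards, steps=500, alpha=0.1, seed=0):
    """REINFORCE on a hypothetical bandit with fixed per-arm rewards."""
    rng = random.Random(seed)
    prefs = [0.0] * len(rewards)
    for _ in range(steps):
        probs = softmax(prefs)
        action = rng.choices(range(len(prefs)), weights=probs)[0]
        G = rewards[action]  # one-step episode: return = immediate reward
        grad = grad_log_softmax(prefs, action)
        # Gradient ascent on expected return: theta += alpha * G * grad log pi
        prefs = [p + alpha * G * g for p, g in zip(prefs, grad)]
    return softmax(prefs)


final_probs = reinforce_bandit([0.0, 1.0])
```

With these rewards the policy should shift probability toward the second arm, since only that arm produces a non-zero return.
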
Deep reinforcement-learning
1. Deep Q-Network (DQN)
  A deep reinforcement-learning algorithm using experience replay and fixed Q-targets.
2. Deep Recurrent Q-Learning (DRQN)
  A deep reinforcement-learning algorithm for POMDP that extends DQN with an LSTM.
3. Optimistic Exploration with Pessimistic Initialization (OPIQ)
  A deep reinforcement-learning algorithm for MDP, based on DQN.
4. Value Decomposition Networks (VDN)
  A multi-agent deep reinforcement-learning algorithm for Dec-POMDP.
5. QMIX
  A multi-agent deep reinforcement-learning algorithm for Dec-POMDP.
6. QTRAN
  A multi-agent deep reinforcement-learning algorithm for Dec-POMDP.
7. Weighted QMIX
  A multi-agent deep reinforcement-learning algorithm for Dec-POMDP.
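
Two ideas from this list can be illustrated without a neural network. The sketch below is an assumption-laden toy, not any library's API: `ReplayBuffer`, `td_target`, `vdn_total`, and `vdn_greedy` are hypothetical names, and plain lists of numbers stand in for network outputs. It shows the experience replay and fixed Q-target pieces of DQN, and the additive decomposition used by VDN, where the joint value is the sum of per-agent values.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of past transitions, sampled uniformly at random."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)  # oldest transitions drop out first

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def td_target(reward, next_state, done, target_q, gamma=0.99):
    """Fixed Q-target: bootstrap from a frozen copy of the Q-function.

    target_q is a callable mapping a state to a list of action values;
    in DQN it would be the periodically updated target network.
    """
    if done:
        return reward
    return reward + gamma * max(target_q(next_state))


def vdn_total(agent_qs, joint_action):
    """VDN: the joint action value is the sum of each agent's own value."""
    return sum(q[a] for q, a in zip(agent_qs, joint_action))


def vdn_greedy(agent_qs):
    """Because the mix is a sum, each agent can pick its own argmax."""
    return [max(range(len(q)), key=q.__getitem__) for q in agent_qs]
```

The additive form is what makes decentralized execution possible in this sketch: maximizing the sum is the same as letting every agent maximize its own term.
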

Resources

Wikipedia
David Silver's Reinforcement-learning course

Duration 0 Hours 58 Minutes

Free

Self paced

All Levels

English (US)

1679
