
Study Circle in Deep Reinforcement Learning


  • The final lecture has been cancelled due to time constraints
  • Meeting 3 and Meeting 4 have been merged into one mega lecture on Thursday 11 March at 3:30 pm
  • The meetings have moved from 1:00 pm to 2:00 pm every Wednesday
  • The assignments have been rearranged so that each one follows the lectures it builds on


Graduate/PhD level course on Deep Reinforcement Learning (RL) in study circle form. This course will build on the previous study circle on reinforcement learning given in 2019.

The recommended prerequisites are an understanding of RL algorithms and basic knowledge of building and training neural networks with TensorFlow. See the initial lecture below for a crash course on the two topics.

The course will consist of 8 lectures (which you will watch before the session) and three assignments applying concepts from the lectures. The lectures will mainly be selected topics from the Berkeley Deep RL course and the UCL RL course. The assignments will follow this interesting free course on GitHub and some assignments from the Berkeley course. The course will lean more heavily on using RL to explore game environments and control techniques.

Course Responsible: Gautham Nayak Seetanadi

Course Examiner: Karl-Erik Årzen

Recurring Zoom meeting:


  • Meeting 0: 10th February 2021: 13:00 - 14:00
  • Meeting 1: 17th February 2021: 13:00 - 14:00
    • Deep Reinforcement Learning with Q functions
  • Meeting 2: 24th February 2021: 14:00 - 15:00
    • (Advanced) Policy Gradients
      • Watch the lecture on policy gradients (video, slides)
      • Also go through the lecture on advanced policy gradients (video, slides)
      • Assignment 1: Go through the Policy Gradients assignment (HW2 of course CS285)
        • The assignment is due in two weeks, on 10th March
  • Meeting 3: 3rd March 2021: 14:00 - 15:00
  • Meeting 4: 10th March 2021: 14:00 - 15:00
    • Model Based RL
      • Watch the lecture (video, slides)
      • Assignment 1 is due
      • Assignment 2: Deep Q network assignment (HW3 of course CS285)
        • Assignment 2 is due in two weeks, on 24th March
  • Meeting 5: 17th March 2021: 14:00 - 15:00
    • Model Based Policy Learning
  • Meeting 6: 24th March 2021: 14:00 - 15:00
    • Control as Inference
      • Watch the lecture (video, slides)
      • Assignment 2 is due
      • Assignment 3: Model-based RL assignment (HW4 of course CS285)
  • Meeting 7 [CANCELLED]: 31st March 2021: 14:00 - 15:00
    • Inverse Reinforcement Learning

Pointers for assignments:

  • All assignments require a MuJoCo license. The license is free for students but takes about 3 days to arrive, so apply early.
  • HW1 has information on how to set up the environment. Download the zip for HW1 here.

Additional Resources:

Courses on reinforcement learning:

  • UCL course by David Silver: Provides a basic introduction to concepts in RL
  • CMU course: In-depth course on RL with many resources. No video lectures appear to be available.
  • GitHub course: Small hands-on programming course on applying RL to video games

Environments to evaluate RL algorithms:

Why Deep RL might not be the answer: