Applied Reinforcement Learning

Lecturer: Hao Shen
Assistant: Martin Gottwald
Target Audience: Elective, supplementary lecture (Master)
Workload: 2/2 SWS (weekly semester hours, lecture/tutorial)
Term: Summer
Registration: TUMonline
Start registration: 05.02.2018, 8:00 am
Time & Place: Lecture: 26 - 28.03.2018 & 04 - 06.04.2018, in Z995, 9:00 - 17:00
Tutorial: during Semester on Thursdays, 13:15 - 14:45, Z995
Individual Question Session: during Semester on Thursdays, 10:00 - 12:00
Start: First lecture 26.03.2018 (attendance is required, see below)


Reinforcement learning (RL) is one of the most powerful approaches to solving sequential decision-making problems. A reinforcement learning agent interacts with its environment and uses the experience it gathers to choose actions that bring it closer to solving the problem. The technique has been applied successfully in operations research, robotics, game playing, network management, and computational intelligence.
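
The interaction between agent and environment can be sketched in a few lines of Python. The following is a minimal illustration only, not course material: the five-cell corridor task, the function names `step`, `train`, and `greedy_policy`, and all parameter values are assumptions chosen for this example. It uses tabular Q-learning as the learning rule.

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the goal (hypothetical toy task)
ACTIONS = (-1, +1)    # step left or step right


def step(state, action):
    """Environment: move along the corridor; reward 1 on reaching the goal."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Agent: learn action values from experience with tabular Q-learning."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # q[state][action index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy choice; ties are broken at random.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-learning update toward reward plus discounted best next value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Best known action for every non-terminal cell."""
    return [ACTIONS[0] if row[0] > row[1] else ACTIONS[1] for row in q[:-1]]
```

Under these assumptions, `greedy_policy(train())` should step right in every non-terminal cell, since that is the shortest path to the goal.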

This lecture provides an overview of the basic concepts, practical techniques, and programming tools used in reinforcement learning. It focuses on the applied side of the subject, such as problem solving and implementation. By design, it complements the theoretical treatment of the subject (mathematical derivations, convergence proofs, and bound analysis), which is covered in the lecture "Approximate Dynamic Programming and Reinforcement Learning" in winter semesters.

In this lecture, we will cover the following topics, among others:

  • Reinforcement learning problems as Markov decision processes
  • Dynamic programming (value iteration and policy iteration)
  • Monte Carlo reinforcement learning methods
  • Temporal difference learning (SARSA and Q-learning)
  • Simulation-based reinforcement learning algorithms
  • Linear value function approximation, e.g. tile coding
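
To give a flavor of the dynamic programming topic, here is a minimal sketch of value iteration. It assumes a hypothetical deterministic corridor with five cells in which entering the last cell pays a reward of 1; the function name `value_iteration` and its parameters are illustrative choices, not part of the course material.

```python
def value_iteration(n_states=5, gamma=0.9, tol=1e-10):
    """Value iteration on a hypothetical deterministic corridor MDP.

    States are 0..n_states-1 and the last state is terminal. Actions move one
    cell left or right (clipped at the left wall); entering the terminal
    state pays a reward of 1, every other transition pays 0.
    """
    goal = n_states - 1
    values = [0.0] * n_states
    while True:
        delta = 0.0
        for s in range(goal):                  # the terminal value stays 0
            candidates = []
            for move in (-1, +1):
                nxt = min(max(s + move, 0), goal)
                reward = 1.0 if nxt == goal else 0.0
                candidates.append(reward + gamma * values[nxt])
            best = max(candidates)             # Bellman optimality backup
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:                        # stop once a sweep changes nothing
            return values
```

Unlike the sampling-based methods in the list, this sketch needs the full transition and reward model, which is why it loops over all states and actions instead of interacting with an environment.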

The course project is done in groups, and each group works on a physical robot. Currently we can provide:

  • Poppy Humanoid
  • Poppy Ergo
  • Stem Kit Level 1 & 2
  • Turtlebot
  • Metabot V2
  • E-Puck

It is possible to extend the existing robots during the project (e.g. by adding new sensors).

On completion of this course, students are able to:

  • describe classic scenarios of reinforcement learning problems;
  • explain basics of reinforcement learning methods;
  • model real engineering problems using reinforcement learning methods;
  • compare in practice, within their specific projects, the performance of the reinforcement learning algorithms covered in the course;
  • select suitable reinforcement learning algorithms for specific problems, and justify their choices;
  • construct and implement reinforcement learning algorithms to solve simple robotics problems on physical systems.

Important Information

Because the number of available robots is limited, the number of participants has to be restricted. Please note the following procedure:

  • If you are interested in the course, sign up on TUMonline.
  • Attend the block course before the semester starts (mandatory).
  • The places in the practical part of the lecture will be filled from everyone who signed up, including the waiting list, provided they attended the complete lecture. On the last lecture day (06.04.2018) you can choose your team members for the project.
  • Once you have signed up and attended the complete lecture, you are committed to the course and thus occupy a robot. If you drop out later on, you will have prevented other students from taking the course. Only sign up if you are sure you will stay in the course for the whole semester.

Lecture Details

The lecture consists of the following parts:

  1. six days of classroom lectures before the semester starts
  2. weekly tutorial sessions (two hours per week) throughout the semester
  3. additional practical and question sessions

Times and locations are listed at the top of this page. The first lecture is on Monday, 26.03.2018, starting at 09:00.


Literature

  • Sutton, R. S. & Barto, A. G., Reinforcement Learning: An Introduction. The MIT Press, 1998 (second edition in progress)
  • Bertsekas, D. P. & Tsitsiklis, J. N., Neuro-Dynamic Programming. Athena Scientific, 1996
  • Bertsekas, D. P., Dynamic Programming and Optimal Control, Vol. 1 & 2.
  • Szepesvári, C., Algorithms for Reinforcement Learning. Morgan & Claypool, 2010 (a draft)

Target Audience and Signup

Students in a Master's degree program. Registration via TUMonline.