Applied Reinforcement Learning
|Targeted Audience:|Elective, supplementary lecture (Master)|
|Scope:|2/2 SWS (Lecture/Tutorial)|
|Registration start:|01.03.2019, 8:00 am|
|Lecture:|01.-03.04.2019 & 08.-10.04.2019, 9:00-17:00, in Z995|
|Tutorial/Exercise:|Thursdays during the semester, 13:15-14:45, Z995|
|Question Session:|Thursdays during the semester, 10:00-12:00|
|First lecture in semester:|25.04.2019|
Reinforcement learning (RL) is one of the most powerful approaches to solving sequential decision making problems. A reinforcement learning agent interacts with its environment and uses its experience to make decisions towards solving the problem. The technique has succeeded in various applications, including operations research, robotics, game playing, network management, and computational intelligence.
This lecture provides an overview of basic concepts, practical techniques, and programming tools used in reinforcement learning. Specifically, it focuses on the application aspects of the subject, such as problem solving and implementation. By design, it aims to complement the theoretical treatment of the subject, such as mathematical derivations, convergence proofs, and bound analysis, which are covered in the lecture "Approximate Dynamic Programming and Reinforcement Learning" in winter semesters.
In this lecture, we will cover the following topics (among others):
- Reinforcement learning problems as Markov decision processes
- Dynamic programming (value iteration and policy iteration)
- Monte Carlo reinforcement learning methods
- Temporal difference learning (SARSA and Q learning)
- Simulation-based reinforcement learning algorithms
- Linear value function approximation, e.g. tile coding
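To give a flavour of the temporal difference methods listed above, the following is a minimal tabular Q-learning sketch on a toy five-state chain problem. The environment, hyperparameters, and function names are illustrative only and are not part of the course material:

```python
import random

# Toy chain MDP: states 0..4, actions 0 = left, 1 = right,
# reaching state 4 (the goal) yields reward 1 and ends the episode.
N_STATES, GOAL = 5, 4
ACTIONS = [0, 1]

def step(state, action):
    """Deterministic transition along the chain."""
    next_state = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL

def q_learning(episodes=2000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    Q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy action selection.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: Q[state][a])
            next_state, reward, done = step(state, action)
            # Q-learning update: bootstrap from the greedy next-state value.
            target = reward + gamma * max(Q[next_state])
            Q[state][action] += alpha * (target - Q[state][action])
            state = next_state
    return Q

Q = q_learning()
# Greedy policy per non-terminal state; moving right is optimal everywhere.
policy = [max(ACTIONS, key=lambda a: Q[s][a]) for s in range(GOAL)]
print(policy)
```

Replacing the bootstrapped `max(Q[next_state])` with the value of the action actually taken in the next state would turn this into SARSA, the on-policy counterpart also covered in the lecture.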
The course project is done in groups of three; each group works on a physical robot. Currently we can provide:
- Poppy Humanoid
- Poppy Ergo
- Stem Kit Level 1 & 2
- Metabot V2
It is possible to extend the existing robots during the project (e.g. add new sensors, more construction parts, additional equipment required for projects, etc.).
On completion of this course, students are able to:
- describe classic scenarios of reinforcement learning problems;
- explain basics of reinforcement learning methods;
- model real engineering problems using reinforcement learning methods;
- compare the practical performance of the reinforcement learning algorithms covered in the course within the specific projects;
- select proper reinforcement learning algorithms for specific problems, and justify their choices;
- construct and implement reinforcement learning algorithms to solve simple robotics problems on physical systems.
Due to the limited number of available robots, the number of participants has to be restricted. Please mind the following procedure:
- If you are interested in the course, sign up on TUMOnline.
- Attend the block course before the semester starts (mandatory).
- The places for the practical part of the lecture will be filled from all people showing up on the last lecture day (10.04.2019). The ordering is given by the waiting list.
- Once you have attended the complete lecture and signed up for the practical part, you are committed to the course and thus block a robot. If you drop out of the course later on, you will prevent other students from taking it. Only sign up if you are sure you will stay in the course for the whole semester!
The lecture consists of the following parts:
- six days of frontal teaching sessions before the semester starts
- weekly tutorial sessions (two hours per week) throughout the semester
- additional practical and question sessions
Time and location are at the top of this page.
- Sutton, R. S. & Barto, A. G., Reinforcement Learning: An Introduction. The MIT Press, 1998 (second edition in progress)
- Bertsekas, D. P. & Tsitsiklis, J., Neuro-Dynamic Programming. Athena Scientific, 1996
- Bertsekas, D. P., Dynamic Programming and Optimal Control Vol. 1 & 2. Athena Scientific
- Szepesvári, C., Algorithms for Reinforcement Learning. Morgan & Claypool, 2010 (a draft)
Target Audience and Signup
Students in a Master's degree program. Registration via TUMOnline.