Approximate Dynamic Programming
Lecturer: Hao Shen
Assistants: Dominik Meyer
Target audience: Master's students
Scope: Supplementary lecture (2/3/0 weekly semester hours (SWS) for lecture/exercise/lab)
Offered: Winter semester
Time & place: Lecture: Tuesdays, 13:15 - 14:45 in Z995
Exercise: Wednesdays, 13:15 - 14:45 in 0999
Start: first lecture expected on 13 October 2015


Approximate dynamic programming (ADP) and reinforcement learning (RL) are two closely related paradigms for solving sequential decision-making problems. ADP methods approach these problems from the optimal control side, developing control methods that adapt to uncertain systems over time, while RL algorithms take the perspective of an agent that optimizes its behavior by interacting with its environment and learning from the feedback received. Both paradigms have been applied successfully in operations research, robotics, game playing, network management, and computational intelligence.
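
To make the agent's perspective concrete, the following is a minimal sketch of tabular Q-learning, one common RL algorithm. It is an illustration only and not taken from the course material: the five-state chain environment, the function names (step, q_learning, greedy_policy), and all parameter values are assumptions chosen for brevity.

```python
import random

N_STATES = 5       # hypothetical toy chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)   # 0 = move left, 1 = move right


def step(state, action):
    """Deterministic environment: reward 1 only when the goal state is reached."""
    next_state = max(state - 1, 0) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Learn action values purely from sampled interaction with the environment."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Assumed behavior policy: uniformly random exploration.
            action = rng.choice(ACTIONS)
            next_state, reward, done = step(state, action)
            # Bootstrap from the best next action unless the episode ended.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best action in every non-terminal state according to the learned values."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

Calling greedy_policy(q_learning()) returns one action per non-terminal state. In this toy chain the agent never sees the transition model; it only uses the observed next state and reward as feedback.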

The course covers the following topics, among others:

  • Markov decision processes
  • Dynamic programming
  • Approximate dynamic programming
  • Reinforcement learning
  • Policy gradient algorithms
  • Partially observable Markov decision processes
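
As a rough illustration of the first two topics, the sketch below runs value iteration, a classic dynamic programming method, on a small Markov decision process. The five-state chain, the discount factor, and the names transition and value_iteration are hypothetical choices made for this example, not part of the course material.

```python
GAMMA = 0.9    # assumed discount factor
N_STATES = 5   # hypothetical chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def transition(state, action):
    """Known deterministic model: returns (next_state, reward)."""
    next_state = max(state - 1, 0) if action == 0 else state + 1
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def value_iteration(tol=1e-10):
    """Apply the Bellman optimality backup in place until values stop changing."""
    values = [0.0] * N_STATES  # the terminal state keeps value 0
    while True:
        delta = 0.0
        for s in range(N_STATES - 1):
            outcomes = [transition(s, a) for a in ACTIONS]
            best = max(r + GAMMA * values[s2] for s2, r in outcomes)
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            return values
```

Unlike a learning agent, this procedure assumes the transition model is fully known and sweeps over every state. Approximate methods become relevant when the state space is too large for such sweeps or the model is unavailable.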

Target audience and registration

Course communication will be handled through the Moodle page.

On completion of this course, students are able to:

  • describe classic scenarios in sequential decision-making problems;
  • explain basic models of ADP/RL methods;
  • derive ADP/RL algorithms that are covered in the course;
  • characterize convergence properties of the ADP/RL algorithms covered in the course;
  • compare the performance of the ADP/RL algorithms covered in the course, both theoretically and practically;
  • select proper ADP/RL algorithms in accordance with specific applications;
  • construct and implement ADP/RL algorithms to solve simple robotics problems on the e-puck platform.