Topics in Multimedia Signal Processing
Lecturer: Klaus Diepold + research assistants
Assistants:
Target audience: students in the MScEI and MSCE programs
ECTS: 3
Scope: (SWS 2/0/0)
Offered: summer semester
Registration: no registration required
Time & place: Fridays, 14:00-15:30, seminar room Z995
Start: first lecture on 19.04.2013

Content

The course consists of a sequence of presentations given by research assistants of the institute, covering ongoing research work at the LDV. The research assistants explain their research projects to students and interested researchers. The course may serve as orientation for master's-level students who are about to choose a topic for their master's thesis. It may also inform students about the international state of research on the respective topics.

Essays

Please use the IEEE conference template for your essays with the following settings:
\documentclass[conference,a4paper,onecolumn,draftclsnofoot]{IEEEtran}

If you don't want to use the LaTeX template, make sure to use A4 paper size, a single column, and double line spacing.

Please send your essays to zwick@tum.de by the respective deadline.

Deadlines:

  • 16.05.2013: Two-page essay about EITHER
    • Signal processing for robotic sound source localization OR
    • Integrated accelerometry-based fall detection and fall risk prediction system
  • 06.06.2013: Three-page essay about BOTH topics
    • View synthesis for stereoscopic 3D television AND
    • Machine learning for solving inverse problems in vision
  • 20.06.2013: Three-page essay about BOTH topics
    • Robust PCA and subspace tracking for background subtraction in video AND
    • Efficient rank structured solvers for variational image processing
  • 05.07.2013: Three-page essay about BOTH topics
    • Machine learning in cognitive systems AND
    • Reinforcement learning for robotic navigation

Evaluation criteria: the paper is submitted on time, the paper is correctly formatted, linguistic diligence, the central questions are addressed, and the paper incorporates personal statements/ideas of the author.

Schedule

19.04.2013, Michael Zwick:
Introduction
Please note that we updated the slides to be more precise about the essays and the number of pages you have to write.

26.04.2013, Tim Habigt:
Signal Processing for Robotic Sound Source Localization
Sound source localization algorithms try to determine the position of a source in the environment by analyzing recorded observations of its signal. Together with signal processing techniques like beamforming, they allow robots to localize sound sources, amplify desired source signals from particular directions, and suppress unwanted noise. This lecture will give an overview of different sound source localization strategies and show their application in robotic scenarios.

Lecture slides
Matlab scripts
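
As a rough illustration of the localization idea in the abstract above, the following sketch estimates the time difference of arrival between two microphones by cross-correlation and converts it to a far-field angle. It is not taken from the lecture material: the function names, the two-microphone geometry, the microphone spacing, the sampling rate, and the synthetic pulse are all assumptions made for this example.

```python
import math

def cross_correlation_delay(x, y, max_lag):
    """Return the lag (in samples) at which y best matches a delayed copy of x."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n in range(len(x)):
            m = n + lag
            if 0 <= m < len(y):
                score += x[n] * y[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def direction_of_arrival(lag, fs, mic_distance, c=343.0):
    """Far-field angle in degrees (0 = broadside) from a delay in samples."""
    # Clamp to [-1, 1] so that noisy delays cannot break asin
    s = max(-1.0, min(1.0, lag / fs * c / mic_distance))
    return math.degrees(math.asin(s))

# Synthetic data (assumed): a short pulse reaches microphone 2 three samples late
fs = 16000
pulse = [0.0, 0.5, 1.0, -0.7, 0.3, -0.1]
mic1 = [0.0] * 10 + pulse + [0.0] * 20
mic2 = [0.0] * 13 + pulse + [0.0] * 17

lag = cross_correlation_delay(mic1, mic2, max_lag=8)
angle = direction_of_arrival(lag, fs, mic_distance=0.2)
print(lag, round(angle, 1))
```

Real recordings would usually call for a weighted cross-correlation and more than two microphones; the plain time-domain correlation here is only meant to show the principle.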

03.05.2013, Cristina Soaz:
Integrated Accelerometry-Based Fall Detection and Fall Risk Prediction System
Falls have an increasing social and economic impact, especially in developed countries, as the life expectancy of the elderly rises dramatically. Fall-related injuries increase morbidity, mortality, and premature use of health care services. It has been shown that the most cost-effective fall reduction interventions are the ones that target high-risk groups. Identifying the risk factors for falling is therefore crucial for the success of these programmes. The degeneration of gait and balance control in the elderly and in most persons with neurological and musculoskeletal disorders, such as MS or osteoporosis, is one of these major risk factors. Unfortunately, the technologies used in gait laboratories (force plates, video recording systems, ...) are too expensive for primary care clinical settings and are not portable, so they cannot quantify the patient's status at any point in time or in real-life environments. Recently, accelerometers have been presented as a portable and feasible alternative, giving results with plausibly strong correlations to force plate signals and good ecological validity. This talk presents a new approach for the development of an integrated fall risk prediction and fall detection system using a single waist-worn accelerometer linked to an Android mobile phone.

Lecture Slides
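
To make the fall detection part of the abstract above more concrete, here is a minimal sketch of a common threshold heuristic: an impact peak in the acceleration magnitude followed by a period of stillness. This is not the approach presented in the talk. The function name, all thresholds, the sampling rate, and the synthetic signals are assumptions chosen for illustration and are not validated values.

```python
import math

def detect_fall(samples, fs, impact_g=2.5, still_tol_g=0.15,
                settle_seconds=0.5, still_seconds=1.0):
    """Return the index of the first impact followed by stillness, else None.

    samples: list of (ax, ay, az) tuples in units of g; fs: sampling rate in Hz.
    All thresholds are illustrative assumptions.
    """
    mag = [math.sqrt(ax * ax + ay * ay + az * az) for ax, ay, az in samples]
    settle = int(settle_seconds * fs)
    window = int(still_seconds * fs)
    for i, m in enumerate(mag):
        if m >= impact_g:
            # After a short settling time, the magnitude should stay near 1 g
            segment = mag[i + settle:i + settle + window]
            if len(segment) == window and all(
                    abs(v - 1.0) <= still_tol_g for v in segment):
                return i
    return None

# Synthetic signals (assumed): standing, impact, brief bounce, then lying still
fs = 50
fall = ([(0.0, 0.0, 1.0)] * 50 + [(1.5, 0.5, 2.5)]
        + [(0.3, 0.2, 1.4)] * 5 + [(1.0, 0.0, 0.0)] * 100)
# Vigorous movement with repeated peaks but no stillness afterwards
jog = [(0.0, 0.0, 2.6), (0.0, 0.0, 0.4)] * 100

print(detect_fall(fall, fs), detect_fall(jog, fs))
```

Requiring stillness after the peak is what separates the simulated fall from the vigorous movement, which also crosses the impact threshold.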

 

10.05.2013: no lecture

17.05.2013, Julian Habigt:
View Synthesis for Stereoscopic 3-D Television
Stereoscopic 3-D television creates the impression of depth by providing two views of a scene to the eyes of the viewer. Autostereoscopic displays free the user from having to wear glasses, but require a much larger number of views of the same scene, e.g. 28 views or more. As it is not feasible to use such a large number of cameras, intermediate views have to be synthesized in high quality from the pictures of two cameras. This lecture will give an introduction to the fundamentals of view synthesis and the challenges that arise.
Lecture Slides
Matlab Scripts
References
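
The view synthesis idea from the abstract above can be sketched on a single scanline. The example assumes rectified cameras, a known per-pixel disparity for the left view, and the convention that a pixel at column x in the left image appears at x - d in the right image. These assumptions, the function names, and the toy data are ours and are not taken from the lecture material.

```python
def synthesize_scanline(left, disparity, alpha):
    """Forward-warp one scanline of the left view to position alpha in [0, 1].

    alpha = 0 reproduces the left view, alpha = 1 approximates the right view.
    Pixels landing on the same target compete by disparity (nearer wins).
    """
    width = len(left)
    out = [None] * width
    best_disp = [-1.0] * width
    for x in range(width):
        target = int(round(x - alpha * disparity[x]))
        if 0 <= target < width and disparity[x] > best_disp[target]:
            out[target] = left[x]
            best_disp[target] = disparity[x]
    return out

def fill_holes(line):
    """Fill disoccluded pixels (None) with the nearest valid pixel to the right.

    With the sign convention above, holes open on the right of foreground
    objects, so the right neighbour is background. Holes at the right image
    border stay None.
    """
    filled = list(line)
    last = None
    for x in range(len(filled) - 1, -1, -1):  # scan right to left
        if filled[x] is None:
            filled[x] = last
        else:
            last = filled[x]
    return filled

# Toy scanline (assumed): a bright foreground object with disparity 4
left = [10, 11, 12, 13, 200, 201, 16, 17, 18, 19]
disparity = [0, 0, 0, 0, 4, 4, 0, 0, 0, 0]

warped = synthesize_scanline(left, disparity, alpha=0.5)
middle_view = fill_holes(warped)
print(warped)
print(middle_view)
```

The two `None` entries in the warped line are the disocclusions: background that neither the object nor its old neighbours cover in the new view. Filling them plausibly is one of the challenges the abstract alludes to.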

24.05.2013, Martin Kiechle, Julian Wörmann:
Machine Learning for Solving Inverse Problems in Vision
Linear inverse problems are ubiquitous in the field of image processing. Prominent examples are image deconvolution, denoising, inpainting, upsampling, and image reconstruction from few indirect measurements as in Compressive Sensing. In all these problems, the goal is to reconstruct an unknown image as accurately as possible from a set of indirect and possibly corrupted measurements. In most interesting cases this process is highly ill-posed, so prior knowledge or assumptions about the general statistics of images have to be exploited. One assumption that has proven successful in image recovery is that natural images can be represented as a linear combination of very few atoms of a dictionary. The fewer atoms required, the better the dictionary and the better the reconstruction quality. A dictionary can either be given analytically, like a wavelet basis or the overcomplete discrete cosine transform, or it can be learned from a set of training samples. Analytic dictionaries offer the advantage of being applicable to a broad set of images, whereas learned dictionaries lead to sparser representations and better reconstruction quality for images closely related to the training data.
Lecture Slides
Matlab Scripts
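
The sparse representation assumption in the abstract above can be illustrated with plain matching pursuit, a simple greedy sparse coding scheme. It is used here only as a stand-in and is not necessarily the method covered in the lecture. The tiny dictionary (the standard basis of R^4 plus two Haar-like atoms) and the test signal are assumptions made for this example.

```python
def matching_pursuit(signal, atoms, n_nonzero):
    """Greedy sparse coding: signal is approximated by sum_k coeffs[k] * atoms[k].

    atoms: list of unit-norm vectors (the dictionary).
    Returns the coefficient list and the final residual.
    """
    residual = list(signal)
    coeffs = [0.0] * len(atoms)
    for _ in range(n_nonzero):
        # Pick the atom most correlated with the current residual
        corr = [sum(r * a for r, a in zip(residual, atom)) for atom in atoms]
        k = max(range(len(atoms)), key=lambda i: abs(corr[i]))
        coeffs[k] += corr[k]
        residual = [r - corr[k] * a for r, a in zip(residual, atoms[k])]
    return coeffs, residual

# Overcomplete toy dictionary (assumed): standard basis plus two Haar-like atoms
atoms = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.5, 0.5, 0.5, 0.5],
    [0.5, 0.5, -0.5, -0.5],
]
signal = [3.0, 3.0, 1.0, 1.0]

coeffs, residual = matching_pursuit(signal, atoms, n_nonzero=2)
print(coeffs, residual)
```

In the standard basis alone this signal needs four nonzero coefficients, while the two extra atoms represent it with two. That is the sense in which a better-suited dictionary gives a sparser representation.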

31.05.2013: no lecture

07.06.2013, Clemens Hage:
Robust PCA and Subspace Tracking for Background Subtraction in Video
Material

14.06.2013, Sunil Ramgopal:
Efficient Rank Structured Solvers for Variational Image Processing
Variational approaches to modeling are very common in medical image processing and computer vision, e.g. motion detection by optical flow, image registration, image segmentation, and image denoising. The main bottleneck of variational approaches is the sheer amount of data arising from the inherent optimization process. Increasing sophistication and resolution of imaging sensors (ranging from cameras to medical imaging devices) further increase the amount of data to be processed, leading to prohibitive memory and time consumption. We believe that this problem can only be solved by combining a) robust and efficient numerical schemes and b) optimized software implementations. In line with these criteria, we are particularly interested in investigating the promise offered by established rank structured approaches such as sequentially and hierarchically semi-separable systems.

EVIP.pdf
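
As a small illustration of what "rank structured" means in the abstract above, the sketch below inverts a tridiagonal matrix exactly and checks that an off-diagonal block of the dense inverse has rank at most one. This is the basic property behind semi-separable representations. It is not an implementation of the solvers discussed in the talk; the choice of the 1-D Laplacian, the matrix size, and all function names are assumptions made for this example.

```python
from fractions import Fraction

def tridiagonal_laplacian(n):
    """1-D discrete Laplacian: 2 on the diagonal, -1 on the off-diagonals."""
    return [[Fraction(2 if i == j else -1 if abs(i - j) == 1 else 0)
             for j in range(n)] for i in range(n)]

def invert(matrix):
    """Exact Gauss-Jordan inverse (no pivoting; fine for this SPD matrix)."""
    n = len(matrix)
    aug = [row[:] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = aug[col][col]
        aug[col] = [v / pivot for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def has_rank_at_most_one(block):
    """True if every 2x2 minor of the block vanishes."""
    rows, cols = len(block), len(block[0])
    return all(block[i][j] * block[k][l] == block[i][l] * block[k][j]
               for i in range(rows) for k in range(i + 1, rows)
               for j in range(cols) for l in range(j + 1, cols))

n = 8
inverse = invert(tridiagonal_laplacian(n))
upper_right = [row[n // 2:] for row in inverse[:n // 2]]

print(has_rank_at_most_one(upper_right))  # off-diagonal block
print(has_rank_at_most_one(inverse))      # the full inverse
```

Although the inverse is a dense matrix, the off-diagonal block can be stored as the outer product of two short vectors. Exploiting this kind of structure is what lets rank structured methods reduce memory and computation.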

21.06.2013, Johannes Günther:
Machine Learning in Cognitive Systems
As the laser welding process is hard to control with the standard approach, we use reinforcement learning. The process is observed via a camera and other optical sensors, so most of the information about the process is available as images. One task is to run suitable algorithms on these images to extract the relevant information. With this input, we can train machine learning algorithms to master the process.

28.06.2013, Martin Knopp:
Reinforcement Learning for Robotic Navigation
This lecture gives a brief introduction to Reinforcement Learning and how it is used to enable robots to navigate in an unknown environment while simultaneously building a map of this environment (SLAM problem). We present our experimental setup with four small robots and show how they might use manifold representation techniques to combine their local maps into a global one.
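
For readers new to Reinforcement Learning, the following sketch shows tabular Q-learning on a small grid world. It is a heavily simplified stand-in for the navigation task described above: there is no mapping, no sensor noise, and only a single agent. The grid size, rewards, learning parameters, and all names are assumptions made for this example and do not describe the experimental setup of the lecture.

```python
import random

ACTIONS = [(0, 1), (1, 0), (0, -1), (-1, 0)]  # right, down, left, up
SIZE, GOAL = 4, (3, 3)

def step(state, action):
    """Move on the grid; bumping into a wall leaves the state unchanged."""
    r = min(max(state[0] + action[0], 0), SIZE - 1)
    c = min(max(state[1] + action[1], 0), SIZE - 1)
    nxt = (r, c)
    return nxt, (1.0 if nxt == GOAL else 0.0), nxt == GOAL

def train(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a Q-table with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * 4 for r in range(SIZE) for c in range(SIZE)}
    for _ in range(episodes):
        state = (0, 0)
        for _step in range(100):  # cap the episode length
            if rng.random() < epsilon:
                a = rng.randrange(4)
            else:
                best = max(q[state])
                a = rng.choice([i for i in range(4) if q[state][i] == best])
            nxt, reward, done = step(state, ACTIONS[a])
            # Q-learning update: bootstrap from the best action in the next state
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
            if done:
                break
    return q

def greedy_path(q, start=(0, 0), max_steps=50):
    """Follow the learned greedy policy from start until the goal or a step cap."""
    path, state = [start], start
    while state != GOAL and len(path) <= max_steps:
        a = max(range(4), key=lambda i: q[state][i])
        state, _, _ = step(state, ACTIONS[a])
        path.append(state)
    return path

q = train()
path = greedy_path(q)
print(path)
```

The agent is never told where the goal is. It only receives a reward on arrival, and the Q-table gradually propagates that information back to the start state.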