|Scope:||2/1/0 (weekly semester hours, SWS: lecture/exercise/lab)|
|Time & place:|
Please note: this lecture is part of a module and can only be registered for in combination with Practical Concepts of Machine Learning!
Deep Learning for Multimedia:
Content generated for human consumption, whether video, text, or audio, is unstructured from a machine's perspective: the information it contains is not readily available for processing. Information extraction from unstructured data therefore describes how to extract the salient information from generic content in order to generate a descriptive, structured representation. The resulting metadata can then be processed automatically, in particular to build models that explain or predict samples, e.g. in recommendation systems.

The aim of this lecture is to introduce the methods, algorithms, and underlying machine learning concepts for extracting information from unstructured audio, visual, and textual content using state-of-the-art algorithms, especially deep-learning-based algorithms and architectures such as CNNs, autoencoders, and LSTMs. In addition, existing frameworks and libraries (e.g. Keras, scikit-learn) will be discussed, along with how to use them with the audio, visual, and textual content encountered in (multi-)media applications and services.

The following topics will be covered:
- Why information extraction?
- Introduction to deep learning
- Image/video content
- Object recognition
- Face recognition
- Character recognition (OCR)
- Quality of Experience (QoE)
- Audio/textual content
- Automatic speech recognition (ASR)
- Natural language processing (NLP)
- Python ecosystem of frameworks/libraries for information extraction

Selected topics will be examined in more depth during the lecture and the team-oriented semester project.