Multimodal information processing and analysis

Theodoros Giannakopoulos

Description

Modern multimedia databases can contain millions of files such as videos, digital music collections, image archives and measurements from wearables. High-level semantic descriptions of such multimedia content are crucial for several application domains such as hybrid recommender and personalization systems, multimodal security applications, environmental monitoring and health monitoring systems. Until recently, machine learning and data mining research has focused mainly either on structured types of data (such as text or metadata) or on artificial supervised and/or unsupervised benchmarks that differ considerably from real-world, multimodal use cases. Recent advances in computer vision, speech recognition and deep learning have provided the scientific community with the “tools” to address a wide range of machine learning applications that take multiple modalities into account.

This course provides an introduction to processing and analysis of all basic modalities from

