This thesis studies machine learning techniques for localizing, separating and recognizing audio-visual events. Until recently, the widely used methods for analyzing audio-visual events have involved either laborious pre-processing/post-processing steps to handle videos, or a huge amount of training data to supervise their learning processes. To overcome these limitations, we aim to develop novel approaches that operate within a single end-to-end framework and depend much less on training data than state-of-the-art deep-learning approaches. In particular, we propose novel low-rank and sparse matrix decomposition methods, kernelized matrix decomposition methods and deep neural networks for audio-visual event analysis, with a particular focus on their data-...
From a very young age, we can quickly learn new concepts and distinguish between different kinds of...
This electronic version was submitted by the student author. The certified thesis is available in th...
Audiovisual (AV) representation learning is an important task from the perspec...
The ability to localize visual objects that are associated with an audio source and at the same time...
The sense of hearing plays an important role in our daily lives. In recent years, there have been many...
The automatic recognition of sound events by computers is an important aspect of emerging applicatio...
As an important information carrier, sound carries abundant information about the environment, which...
This HDR manuscript summarizes our work concerning the applications of machine learning techniques t...
The objective of this thesis is to develop novel classification and feature learning techniques for t...
Deep learning techniques such as deep feedforward neural networks and deep convolutional neural netw...
Whether crossing the road or enjoying a concert, sound carries important information about the world...
In this paper, we present a gated convolutional neural network and a temporal attention-based local...