The ability to localize the visual objects associated with an audio source while simultaneously separating the audio signal is a cornerstone of audio-visual signal-processing applications. However, available methods mainly focus on localizing the visual objects only and lack audio separation capabilities. Moreover, these methods often rely either on laborious preprocessing steps that segment video frames into semantic regions or on additional supervision to guide localization. In this paper, we address the problem of visual source localization and audio separation in an unsupervised manner, avoiding all preprocessing and post-processing steps. To this end, we devise a novel structured matrix decomposition method that decompo...
In this work we present a method to perform a complete audiovisual source sepa...
Real-world phenomena involve complex interactions between multiple signal moda...
We propose a novel method to automatically detect and extract the video modality of the sound source...
In this work we present a method to jointly separate active audio and visual structures on a given m...
In this paper, we propose a novel method which is able to detect and separate ...
Audio-visual separation aims to isolate pure audio sources from a mixture with the guidance of its syn...
This thesis studies machine learning techniques for localizing, separating and recognizing audio-vis...
We present a method of improving sound source separation using vision. The sound source separation i...
Visual events are usually accompanied by sounds in our daily lives. However, can the machines learn ...
Audiovisual (AV) representation learning is an important task from the perspec...