The goal of this thesis is to design algorithms that enable robust detection of objects and events in videos through joint audio-visual analysis. This is motivated by humans’ remarkable ability to meaningfully integrate auditory and visual characteristics for perception in noisy scenarios. To this end, we identify two kinds of natural associations between the modalities in recordings made using a single microphone and camera, namely motion-audio correlation and appearance-audio co-occurrence. For the former, we use audio source separation as the primary application and propose two novel methods within the popular non-negative matrix factorization framework. The central idea is to utilize the temporal correlation between audio and motion for objects/...
In this paper, we propose a novel method that exploits correlation between audio-visual dynamics of ...
This chapter addresses sound scene and event classification in multiview setti...
Audiovisual (AV) representation learning is an important task from the perspec...
Acoustic event detection (AED) aims at determining the identity of sounds and their ...
In this paper, we propose a novel method which is able to detect and separate ...
This thesis work focuses on the computational analysis of environmental sound scenes and events. The...
This paper presents a novel method to correlate audio and visual data generated by the same physical...
Current computer vision techniques can effectively monitor gross activities in sparse environments. ...
Acoustic events produced in meeting environments may contain useful information for perceptually awa...
Visual events are usually accompanied by sounds in our daily lives. However, can the machines learn ...
Technological developments and innovations of the first forty years of the digital era have primaril...
Audio-visual event detection aims to identify semantically defined events that reveal human activiti...
Copyright © 2011 Taras Butko et al. This is an open access article distributed under the Creative Co...