International audience. This paper addresses the issues of detecting and localizing objects in a scene that are both seen and heard. We explain the benefits of a human-like configuration of sensors (binaural and binocular) for gathering auditory and visual observations. It is shown that the detection and localization problem can be recast as the task of clustering the audio-visual observations into coherent groups. We propose a probabilistic generative model that captures the relations between audio and visual observations. This model maps the data into a common audio-visual 3D representation via a pair of mixture models. Inference is performed by a version of the expectation-maximization algorithm, which is formally derived, and which provide...
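To make the clustering idea concrete, here is a minimal sketch of the textbook expectation-maximization updates for a single one-dimensional Gaussian mixture. It is not the paper's algorithm: the paper couples a pair of mixture models through a shared 3D representation, and none of that coupling is reproduced here. The function names `gaussian_pdf` and `em_gmm_1d`, the unit initial variances, and the variance floor `min_var` are assumptions made for illustration only.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1D normal distribution with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(data, init_means, n_iter=50, min_var=1e-6):
    """Fit a 1D Gaussian mixture by EM, starting from init_means.

    Illustrative assumptions: unit initial variances, equal initial weights,
    a fixed number of iterations, and a variance floor to avoid collapse.
    """
    k = len(init_means)
    means = list(init_means)
    variances = [1.0] * k
    weights = [1.0 / k] * k

    for _ in range(n_iter):
        # E-step: posterior probability of each component given each point.
        resp = []
        for x in data:
            joint = [weights[j] * gaussian_pdf(x, means[j], variances[j])
                     for j in range(k)]
            total = sum(joint) or 1e-300  # guard against numerical underflow
            resp.append([p / total for p in joint])

        # M-step: re-estimate weights, means, and variances from the posteriors.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave an empty component unchanged
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(min_var, var)

    return weights, means, variances
```

Calling `em_gmm_1d(samples, [-1.0, 6.0])` on samples drawn from two well-separated groups returns one weight, mean, and variance per group. Each point can then be assigned to the component with the largest posterior, which is the sense in which clustering yields detection and localization in this simplified setting.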
Abstract. Situational awareness is achieved naturally by the human senses of sight and hearing in comb...
Abstract—This paper addresses the problem of localizing audio sources using binaural measurements. W...
In this thesis, the modelling of audio-visual perception with a head-like device is considered. The ...
International audience. We address the issue of identifying and localizing individuals in a scene that...
In this paper we address the problem of detecting and localizing objects that can be both seen and h...
In this paper we address the problem of detecting and localizing objects that can be both seen and h...
International audience. In this paper we address the problem of detecting and locating speakers using ...
In this paper we address the problem of detecting and localizing objects that can be both seen and h...
Audio-visual tracking of an unknown number of concurrent speakers in 3D is a challenging task, espec...
PhD Thesis. This thesis concerns the problem of target localization and tracking in an indoor environm...
Compact multi-sensor platforms are portable and thus desirable for robotics and personal-assistance ...
The human auditory system has the striking ability to robustly localize and recognize a specific tar...
Humans can robustly recognize and localize objects by integrating visual and auditory cues. While ma...
International audience. The problem of multimodal clustering arises whenever the data are gathered wit...