In an object-based spatial audio system, the positions of the audio objects (e.g. speakers/talkers or voices) present in the sound scene are required as important metadata attributes for object acquisition and reproduction. Binaural microphones are often used as a physical device to mimic human hearing and to monitor and analyse the scene, including localisation and tracking of multiple speakers. The binaural audio tracker, however, is usually prone to errors caused by room reverberation and background noise. To address this limitation, we present a multimodal tracking method that fuses the binaural audio with depth information (from a depth sensor, e.g., Kinect). More specifically, the probability hypothesis density (PHD) filtering framework is first applied to t...
Audio-visual tracking of an unknown number of concurrent speakers in 3D is a challenging task, espec...
This paper discusses a system capable of detecting the position of the listener through a head-track...
In immersive and interactive audio-visual content, there is very significant scope for spatial misal...
PhD Thesis. This thesis concerns the problem of target localization and tracking in an indoor environm...
Spatial audio has been studied for several decades, but has seen much renewed interest recently due ...
The robust localization of speech sources is required for a wide range of applications, among them h...
The human auditory system has the striking ability to robustly localize and recognize a specific tar...
Compact multi-sensor platforms are portable and thus desirable for robotics and personal-assistance ...
In this paper, a novel probabilistic Bayesian tracking scheme is proposed and applied to bimodal mea...
Humans can robustly recognize and localize objects by integrating visual and auditory cues. While ma...
This paper addresses the issues of detecting and localizing objects in a scene...
In this study, we present a binaural scene analyzer that is able to simultaneously localize, detect ...
Multiple-speaker tracking is a crucial task for many applications. In real-wor...