This paper presents a self-supervised method for visual detection of the active speaker in a multi-person spoken interaction scenario. Active speaker detection is a fundamental prerequisite for any artificial cognitive system attempting to acquire language in social settings. The proposed method is intended to complement the acoustic detection of the active speaker, thus improving the system's robustness in noisy conditions. The method can detect an arbitrary number of possibly overlapping active speakers based exclusively on visual information about their faces. Furthermore, the method does not rely on external annotations, thus complying with cognitive development. Instead, the method uses information from the auditory modality to support le...
The goal of this dissertation is to study and develop approaches to automating detection of social s...
Humans can extract speech signals that they need to understand from a mixture of background noise, in...
We present a novel speaker diarization method by using eye-gaze information in multi-party conversat...
© 2016 ACM. In this work, we show how to co-train a classifier for active speaker detection using au...
Meetings are a common activity in professional contexts, and it remains challe...
This paper extends the affective computing research field by introducing first-person vi...
Chakravarty P., Zegers J., Tuytelaars T., Van hamme H., ''Active speaker detection with audio-visual...
Active speaker detection (ASD) is a multi-modal task that aims to identify who, if anyone, is speak...
One of the main issues within the field of social robotics is to endow robots with the ability to di...
We develop and evaluate models for automatic vision-based voice activity detection (VAD) in multipar...