Despite being studied extensively, the performance of blind source separation (BSS) is still limited, especially for sensor data collected in adverse environments. Recent studies show that such an issue can be mitigated by incorporating multimodal information into the BSS process. In this paper, we propose a method for enhancing the target speech separated from sound mixtures by a BSS algorithm, using visual voice activity detection (VAD) and spectral subtraction. First, a classifier for visual VAD is formed in the off-line training stage, using labelled features extracted from the visual stimuli. Then, we use this visual VAD classifier to detect the voice activity of the target speech. Finally, we apply multi-band spectral subtraction...
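The abstract above sketches a three-stage pipeline: train a visual VAD classifier offline, use it to flag speech/non-speech frames of the target talker, and clean the BSS output with multi-band spectral subtraction. The snippet below is a minimal NumPy sketch of that last stage only; the function name, band layout, over-subtraction factor and smoothing constants are illustrative assumptions rather than the settings used in the paper, and the per-frame `vad_mask` is assumed to come from the visual VAD classifier.

```python
import numpy as np

def multiband_spectral_subtraction(x, vad_mask, frame_len=512, hop=256,
                                   n_bands=4, oversub=2.0, floor=0.02):
    """Hypothetical sketch: VAD-gated multi-band spectral subtraction.

    Frames flagged as non-speech by the (visual) VAD update a per-bin noise
    estimate, which is then over-subtracted band by band from every frame.
    `vad_mask` holds one boolean per frame (True = speech); all constants
    are illustrative.
    """
    win = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    n_bins = frame_len // 2 + 1
    band_edges = np.linspace(0, n_bins, n_bands + 1).astype(int)

    noise_psd = np.full(n_bins, 1e-8)          # running noise estimate
    out = np.zeros(len(x))
    norm = np.zeros(len(x))

    for i in range(n_frames):
        seg = x[i * hop:i * hop + frame_len] * win
        spec = np.fft.rfft(seg)
        mag2 = np.abs(spec) ** 2

        if not vad_mask[i]:                    # non-speech frame: learn noise
            noise_psd = 0.9 * noise_psd + 0.1 * mag2

        # subtract band-wise, with a spectral floor to limit musical noise
        gain = np.ones(n_bins)
        for b in range(n_bands):
            lo, hi = band_edges[b], band_edges[b + 1]
            clean = mag2[lo:hi] - oversub * noise_psd[lo:hi]
            clean = np.maximum(clean, floor * mag2[lo:hi])
            gain[lo:hi] = np.sqrt(clean / np.maximum(mag2[lo:hi], 1e-12))

        rec = np.fft.irfft(spec * gain, n=frame_len) * win
        out[i * hop:i * hop + frame_len] += rec
        norm[i * hop:i * hop + frame_len] += win ** 2

    return out / np.maximum(norm, 1e-8)        # overlap-add normalisation
```

In the full pipeline described above, `vad_mask` would be produced by the visual VAD classifier applied to lip-region features of the target speaker, rather than by any acoustic detector.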
Because it can be found in many applications, the Blind Separation of Sources (BSS) problem has rais...
Speech separation is the task of segregating a target speech signal from background interference. To...
In this paper we present two novel methods for visual voice activity detection (V-VAD) which exploit...
The visual modality, deemed to be complementary to the audio modality, has recently been ex...
Audio–visual speech source separation consists in mixing visual speech process...
Current voice activity detection methods generally utilise only acoustic information. Therefore they...
The aim of this work is to utilize both audio and visual speech information to create a robust voice...
Voice Activity Detection (VAD) refers to the problem of distinguishing speech segments from...
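To make the speech/non-speech decision concrete, here is a deliberately naive acoustic-only detector (a toy sketch under assumed parameters, not one of the methods described in these abstracts): it thresholds per-frame log energy, which is exactly the kind of decision that breaks down at low SNR and motivates the visual VAD approaches discussed here.

```python
import numpy as np

def energy_vad(x, frame_len=512, hop=256, threshold_db=-35.0):
    """Toy acoustic VAD: flag a frame as speech when its log energy is
    within `threshold_db` of the loudest frame. Purely illustrative of the
    speech/non-speech decision; real detectors use richer features and
    learned classifiers."""
    n_frames = 1 + (len(x) - frame_len) // hop
    energy = np.array([
        np.mean(x[i * hop:i * hop + frame_len] ** 2) for i in range(n_frames)
    ])
    log_e = 10.0 * np.log10(energy + 1e-12)
    return log_e > (log_e.max() + threshold_db)   # one boolean per frame
```

Its boolean per-frame output matches the `vad_mask` format assumed by the spectral subtraction sketch earlier, so the two toys can be chained for experimentation.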
Humans can extract the speech signals they need to understand from a mixture of background noise, in...
Spontaneous speech in videos capturing the speaker's mouth provides bimodal information. Exploiting ...
Humans with normal hearing ability are generally skilful in listening selectively to a particular sp...