As evidence of a link between the various human communication production domains has become more prominent in the last decade, the field of multimodal speech processing has undergone significant expansion. Many specialized processing methods have been developed to analyze and exploit the complex relationship between multimodal data streams. This work uses information extracted from an audiovisual corpus to investigate and assess the correlation between audio and visual features in speech. A number of feature extraction techniques are assessed, with the intention of identifying the visual technique that maximizes the audiovisual correlation. Additionally, this paper aims to demonstrate that a noisy and reverber...
Speech perception is a bimodal process that involves both auditory and visual inputs. The auditory s...
While everyone has experienced that seeing lip movements may improve speech perception, little is kn...
Seeing the moving face of the talker permits better detection of speech in noise compared to auditor...
This work investigates a selection of audio and visual speech features with the aim ...
The aim of this work is to examine the correlation between audio and visual speech features. The mot...
In this paper, we address the problem of automatic discovery of speech patterns using audio-visual i...
Seeing the talker improves the intelligibility of speech degraded by noise (a visual speech enhancem...
This book presents a summary of the cognitively inspired basis behind multimodal speech enhancement,...
In recent years, the established link between the various human communication production domains has...
This paper investigates the statistical relationship between acoustic and visual speech features for...
Visual speech information from the speaker’s mouth region has been successfully shown to ...
This experiment examined whether the visual speech enhancement in the intelligibility of degraded ac...
© 2014 IEEE. The visual modality, deemed to be complementary to the audio modality, has recently been...
The visual modality, deemed to be complementary to the audio modality, has recently been ex...
This paper proposes a new method for bimodal information fusion in audio-visual speech recognition, ...