The growing number of multimedia applications that require robust speech recognition has generated considerable interest in audio-visual speech recognition (AVSR) systems. The use of visual features in AVSR is justified by both the audio and visual modalities of speech generation and the need for features that are invariant to acoustic noise. The speaker-independent audio-visual continuous speech recognition system presented in this paper relies on a robust set of visual features obtained from accurate detection and tracking of the mouth region. Further, the visual and acoustic observation sequences are integrated using a coupled hidden Markov model (CHMM). The statistical properties of the CHMM ca...
In this paper an evaluation of visual speech features is performed specifically for the ta...
In this thesis, a number of important issues relating to the use of both audio and video information...
Humans are often able to compensate for noise degradation and uncertainty in speech information by a...
With the increase in the computational complexity of recent computers, audio-visual speech recogniti...
We address the problem of robust lip tracking, visual speech feature extraction, and sensor integrat...
This paper describes a complete system for audio-visual recognition of continuous speech including r...
This paper presents the design and evaluation of a speaker-independent audio-visual speech ...
The use of visual features in audio-visual speech recognition (AVSR) is justified by both the speech...
Visual speech information from the speaker’s mouth region has been successfully shown to ...
Extending automatic speech recognition (ASR) to the visual modality has been shown to greatly incre...
Speechreading increases intelligibility in human speech perception. This suggests that conventional ...
Despite significant advances in the area of Automatic Speech Recognition (ASR), systems still resul...