ABSTRACT We present the development of a modular system for flexible human-computer interaction via speech. The speech recognition component integrates acoustic and visual information (automatic lip-reading), improving overall recognition, especially in noisy environments. The image of the lips, constituting the visual input, is automatically extracted from the camera picture of the speaker's face by the lip locator module. Finally, the speaker's face is automatically acquired and followed by the face tracker sub-system. Integration of the three functions results in the first bi-modal speech recognizer allowing the speaker reasonable freedom of movement within a possibly noisy room while continuing to communicate with the compute...
EUROSPEECH 1997: the 5th European Conference on Speech Communication and Technology, September 22-25...
The multimodal nature of speech is often ignored in human-computer interaction, but lip deformations...
Abstract. This paper presents a speaker-independent audio-visual digit recognition system that utili...
In the last two decades we witnessed a rapid increase of the computational power governed by Moore's...
261 p. Thesis (Ph.D.)--University of Illinois at Urbana-Champaign, 1984. Automatic recognition of the ...
Deaf or hard-of-hearing people mostly rely on lip-reading to understand speech. They demonstrate the...
Computers have become more pervasive than ever with a wide range of devices and multiple ways of int...
Automatic speechreading systems use both acoustic and visual signals to perform speech recognition. ...
This thesis describes how an automatic lip reader was realized. Visual speech recognition is a preco...
When combined with acoustical speech information, visual speech information (lip movement) significa...
In this article a complete audio-visual speech recognition system suitable for embedded devices is p...
Speech carries more information than text, but in noisy environments speech suffers from disadvanta...