We report on investigations, conducted at the 2006 Johns Hopkins Workshop, into the use of articulatory features (AFs) for observation and pronunciation models in speech recognition. In the area of observation modeling, we use the outputs of AF classifiers both directly, in an extension of hybrid HMM/neural network models, and as part of the observation vector, an extension of the tandem approach. In the area of pronunciation modeling, we investigate a model having multiple streams of AF states with soft synchrony constraints, for both audio-only and audio-visual recognition. The models are implemented as dynamic Bayesian networks, and tested on tasks from the Small-Vocabulary Switchboard (SVitchboard) corpus and the CUAVE audio-visual digits ...
Speech recognition has become common in many application domains. Incorporating acoustic-phonetic kn...
The so-called tandem approach, where the posteriors of a multilayer perceptron (MLP) classifier are ...
Phonological studies suggest that the typical subword units such as phones or phonemes used in autom...
We report on investigations, conducted at the 2006 JHU Summer Workshop, of the use of articulatory f...
We describe a dynamic Bayesian network for articulatory feature recognition. The model is intended t...
This paper describes the use of dynamic Bayesian networks for the task of articulatory feature recog...
Artificial neural networks (ANN) have proven to be well suited to the task of articulatory feature (...
Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer S...
Although much is known about how speech is produced, and research into speech production has resulte...
We study the problem of automatic visual speech recognition (VSR) using dynamic Bayesian network (DB...
The ultimate goal of our research is to develop a computational model of human speech recognition th...
The ultimate goal of our research is to develop a computational model of human speech recognition th...
We propose that using a continuous trajectory model to describe an articulatory-based feature set wi...
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer...