The syllables of speech carry information about the speaker's vocal tract length (VTL) as well as the phonetic message. Ideally, the pre-processor used for automatic speech recognition (ASR) should segregate the phonetic message from the VTL information. This paper describes a method for calculating VTL-invariant auditory feature vectors from speech, in which the message and the VTL are segregated. Spectra produced by an auditory filterbank are summarized by a Gaussian mixture model (GMM) to produce a low-dimensional feature vector. These features are evaluated for robustness against conventional mel-frequency cepstral coefficients (MFCCs) using a hidden-Markov-model (HMM) recognizer. A dynamic, compressive gam...
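To illustrate how a spectral profile can be summarized by a GMM, the sketch below fits a one-dimensional Gaussian mixture to a single spectral frame, treating the frame as a histogram over filterbank channel index. It is a minimal illustration under stated assumptions, not the authors' implementation: the function name `fit_spectral_gmm`, the evenly spaced initialisation, the number of components, and the synthetic two-peak frame are all hypothetical.

```python
import numpy as np

def fit_spectral_gmm(spectrum, n_components=4, n_iter=50, min_var=1e-3):
    """Fit a 1-D Gaussian mixture to a non-negative spectral profile.

    The profile is treated as a histogram over channel index, so each
    channel contributes to the fit in proportion to its magnitude
    (weighted EM). Returns (weights, means, variances), sorted by mean.
    """
    spectrum = np.asarray(spectrum, dtype=float)
    x = np.arange(spectrum.size, dtype=float)
    p = spectrum / spectrum.sum()  # normalise the profile to unit mass

    # Assumed initialisation: evenly spaced means, narrow equal variances.
    spacing = spectrum.size / (n_components + 1)
    means = spacing * np.arange(1, n_components + 1)
    variances = np.full(n_components, (spacing / 2.0) ** 2)
    weights = np.full(n_components, 1.0 / n_components)

    for _ in range(n_iter):
        # E-step: responsibility of each component for each channel.
        diff = x[:, None] - means[None, :]
        dens = weights * np.exp(-0.5 * diff**2 / variances)
        dens /= np.sqrt(2.0 * np.pi * variances)
        resp = dens / np.maximum(dens.sum(axis=1, keepdims=True), 1e-300)

        # M-step: update parameters, weighting channels by spectral mass.
        wr = p[:, None] * resp
        mass = np.maximum(wr.sum(axis=0), 1e-12)
        means = (wr * x[:, None]).sum(axis=0) / mass
        diff = x[:, None] - means[None, :]
        variances = np.maximum((wr * diff**2).sum(axis=0) / mass, min_var)
        weights = mass / mass.sum()

    order = np.argsort(means)
    return weights[order], means[order], variances[order]

# Synthetic frame (hypothetical): two peaks at channels 20 and 70.
channels = np.arange(100)
frame = (np.exp(-0.5 * ((channels - 20) / 5.0) ** 2)
         + np.exp(-0.5 * ((channels - 70) / 5.0) ** 2))
w, m, v = fit_spectral_gmm(frame, n_components=2)
feature_vector = np.concatenate([w, m, v])  # 3 values per component
```

Concatenating the weights, means, and variances gives a feature vector with three values per component, which is far smaller than the number of filterbank channels. Which of these parameters are kept, and how they are normalized for VTL, is a design choice that this sketch does not address.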
Speech recognition is the enabling technology allowing humans to communicate with computers using th...
In this paper, we present advances in the modeling of the masking behavior of the human auditory sys...
Distortions due to reverberation have a detrimental effect on the performance of automatic speech reco...
In this paper, we consider the generation of features for automatic speech recognition (ASR) that ar...
This paper proposes a technique of extracting robust feature vectors for ASR. The technique is inspi...
Modern automatic speech recognition (ASR) systems typically use a bank of linear filters as the firs...
Performance of an automatic speech recognition system drops dramatically in the presence of backgrou...
State-of-the-art automatic speech recognition (ASR) systems are significantly inferior to humans esp...
The classical front end analysis in speech recognition is a spectral analysis which parametrizes the...
This work investigates the application of spectral and temporal speech processing algorithms develop...
The accurate extraction of two particular features from the speech signal affected by additive white...
We describe a method to select features for speech recognition that is based on a quantitative model...
Human listeners can identify vowels regardless of speaker size, although the sound waves for an adul...
MFCCs (mel-frequency cepstral coefficients) are a traditional speech feature widely used in sp...
Mel-frequency cepstral coefficients are the most widely used feature in speech recognition, but t...