In speech recognition there has been a trend to incorporate more and more knowledge about human hearing into the feature extraction step. One such approach is localized spectro-temporal analysis, which is inspired by neurophysiological studies. Here we experiment with extracting features from patches of the widely used critical-band log-energy spectrum by applying the two-dimensional cosine transform. Compared to earlier, similar studies that used the spectrogram representation, we find that our method performs no worse and is faster. In experiments with noisy speech, the proposed representation proves more noise-robust than the conventional mel-frequency cepstral features.
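The patch-based extraction described in the abstract can be illustrated with a minimal sketch. It assumes the critical-band log-energy spectrum is already available as a list of rows (one per band) of per-frame log energies, and it keeps only the lowest-order coefficients of an unnormalized two-dimensional DCT-II. The function names, the patch size, and the number of retained coefficients are hypothetical choices for illustration and are not taken from the paper.

```python
import math


def dct2_patch(patch, n_freq_coefs=3, n_time_coefs=3):
    """Low-order coefficients of an unnormalized 2D DCT-II of one patch.

    patch: list of rows (frequency bands), each a list of log energies
           over consecutive time frames.
    Returns a flat list of n_freq_coefs * n_time_coefs coefficients.
    """
    n_rows, n_cols = len(patch), len(patch[0])
    feats = []
    for u in range(n_freq_coefs):          # index along the frequency axis
        for v in range(n_time_coefs):      # index along the time axis
            total = 0.0
            for i in range(n_rows):
                cos_f = math.cos(math.pi * (i + 0.5) * u / n_rows)
                for j in range(n_cols):
                    cos_t = math.cos(math.pi * (j + 0.5) * v / n_cols)
                    total += patch[i][j] * cos_f * cos_t
            feats.append(total)
    return feats


def patch_features(log_spec, band_start, frame_start, n_bands, n_frames,
                   n_freq_coefs=3, n_time_coefs=3):
    """Cut a rectangular patch out of the log-energy spectrum and transform it.

    log_spec: list of bands, each a list of per-frame log energies
              (assumed layout, chosen for this sketch).
    """
    patch = [row[frame_start:frame_start + n_frames]
             for row in log_spec[band_start:band_start + n_bands]]
    return dct2_patch(patch, n_freq_coefs, n_time_coefs)


if __name__ == "__main__":
    # Toy spectrum: 8 bands x 10 frames with a simple ramp in both directions.
    spec = [[0.1 * b + 0.05 * t for t in range(10)] for b in range(8)]
    print(patch_features(spec, band_start=2, frame_start=3, n_bands=4, n_frames=5))
```

In practice a library routine such as SciPy's `scipy.fft.dctn` would replace the explicit loops; the loops are written out here only to make the transform visible.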
Every speech recognition system requires a signal representation that parametrically models the tempo...
This paper introduces a novel set of non-linear spectro-temporal features that improve automatic spe...
This work investigates the application of spectral and temporal speech processing algorithms develop...
Localized spectro-temporal analysis is a novel feature extraction strategy in speech recognition, wh...
Although noise robust automatic speech recognition (ASR) has been a topic of intensive research, to ...
Recent results from physiological and psychoacoustic studies indicate that spectrally and temporally...
Kovács G., ''Noise robust automatic speech recognition based on spectro-temporal techniques'', Proef...
In this work, a first approach to a robust phoneme recognition task by means of a biologically inspi...
This thesis presents a study of alternative speech feature extraction methods aimed at increasing ro...
We introduce the problem of phone classification in the context of speech recognition, and explore s...
The performance of Mel-frequency cepstrum based automatic speech recognition system significantly de...
The speech signal is inherently characterized by its variations in time, which get reflected as vari...
In this paper, we present advances in the modeling of the masking behavior of the human auditory sys...
Recognition of reverberant speech constitutes a challenging problem for typical speech recognition s...
Non-negative spectral factorisation has been used successfully for separation of speech and noise in...