In this paper, we present advances in the modeling of the masking behavior of the human auditory system (HAS) to enhance the robustness of the feature extraction stage in automatic speech recognition (ASR). The adopted solution is based on nonlinear filtering of a spectro-temporal representation, applied simultaneously in the frequency and time domains (as if the representation were an image) using mathematical morphology operations. A particularly important component of this architecture is the so-called structuring element (SE), which in the present contribution is designed as a single three-dimensional pattern based on physiological facts, so that it closely resembles the masking phenomena taking place in the cochlea. A proper choice of spectro-tempo...
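The morphological filtering described in the abstract above can be illustrated with a minimal sketch. It assumes a grey-scale opening (erosion followed by dilation) with a non-flat structuring element, which is one standard mathematical morphology operation; the excerpt does not state which operations the paper actually uses. The function names, the toy spectrogram, and the 3x3 SE values are hypothetical and are not the physiologically derived SE of the paper.

```python
import numpy as np

def grey_erode(img, se):
    """Grey-scale erosion: min over offsets s of img(x + s) - se(s)."""
    kf, kt = se.shape
    pf, pt = kf // 2, kt // 2
    # Pad with +inf so out-of-range samples never win the minimum.
    padded = np.pad(img, ((pf, pf), (pt, pt)), constant_values=np.inf)
    out = np.full(img.shape, np.inf)
    for i in range(kf):
        for j in range(kt):
            window = padded[i:i + img.shape[0], j:j + img.shape[1]]
            out = np.minimum(out, window - se[i, j])
    return out

def grey_dilate(img, se):
    """Grey-scale dilation: max over offsets s of img(x - s) + se(s)."""
    kf, kt = se.shape
    pf, pt = kf // 2, kt // 2
    # Pad with -inf so out-of-range samples never win the maximum.
    padded = np.pad(img, ((pf, pf), (pt, pt)), constant_values=-np.inf)
    out = np.full(img.shape, -np.inf)
    for i in range(kf):
        for j in range(kt):
            r, c = 2 * pf - i, 2 * pt - j
            window = padded[r:r + img.shape[0], c:c + img.shape[1]]
            out = np.maximum(out, window + se[i, j])
    return out

def grey_open(img, se):
    """Opening: erosion followed by dilation with the same SE."""
    return grey_dilate(grey_erode(img, se), se)

# Hypothetical non-flat SE (rows: frequency bins, columns: time frames).
# Its height decays away from the centre, loosely mimicking a masker whose
# influence falls off in both frequency and time.
se = np.array([[-2.0, -1.0, -2.0],
               [-1.0,  0.0, -1.0],
               [-2.0, -1.0, -2.0]])

# Toy "spectrogram" (8 frequency bins x 10 frames) with one isolated spike.
spec = np.zeros((8, 10))
spec[4, 5] = 10.0

opened = grey_open(spec, se)
print("spike before:", spec[4, 5], "after:", opened[4, 5])
```

In this sketch the SE is a 2-D array of heights, so together with its two spatial axes it forms a three-dimensional pattern in the sense used by the abstract. The opening never raises any value of the input and suppresses peaks narrower than the SE, which is why the isolated spike is attenuated while the flat background is left unchanged.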
Abstract — The Mel-Frequency Cepstral Coefficient (MFCC) or Perceptual Linear Prediction (PLP) featu...
The performance of Mel-frequency-cepstrum-based automatic speech recognition systems significantly de...
Performance of an automatic speech recognition system drops dramatically in the presence of backgrou...
Proceedings of: 15th Annual Conference of the International Speech Communication Association. Singap...
New auditory-inspired speech processing methods are presented in this paper, combining spectral subt...
Proceedings of: VII Jornadas en Tecnología del Habla and III Iberian SLTECH Workshop (IberSPEECH 2012). Ma...
This work explores an alternative set of features to the frequently used mel-frequency coefficients (...
International Mention in the doctoral degree. In spite of the enormous leap forward that the Automatic...
A new approach for speech feature extraction in automatic speech recognition (ASR) is proposed in th...
The human ability to classify acoustic sounds is still unmatched compared to recent methods in machi...
While there have been many attempts to mitigate interferences of background noise, the performance o...
One challenging issue in speaker identification (SID) is to achieve noise-robust performance. Humans...
The effect of bio-inspired spectro-temporal processing for automatic speech recognition (ASR) ...
One of the biggest obstacles that hinder the widespread use of automatic speech recognition technolo...