In most automatic speech recognition (ASR) systems, speaker differences are compensated for by normalizing the vocal tract lengths of the speakers. This is implemented by warping the frequency axis by an appropriate warping factor. However, finding a warping factor for each speaker is computationally expensive. This problem is overcome by using a single universal warping function for all speakers. Several psychoacoustic scales proposed over the past decade are assumed to resemble the frequency response of the basilar membrane (BM) of the human auditory system. In this paper, different warping functions are studied with the aim of vocal tract length normalization (VTLN), and template matching experiments are done using dynamic time...
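The contrast between a per-speaker warping factor and a universal warping function can be illustrated with a small sketch. It is not taken from the paper: the mel and Bark formulas are the commonly used O'Shaughnessy and Traunmüller approximations, and the function names, the piecewise-linear form of the speaker-specific warp, the 8000 Hz band edge, and the 0.85 breakpoint ratio are all assumptions chosen for illustration.

```python
import math


def hz_to_mel(f_hz):
    """Universal warp: widely used mel-scale approximation (assumed here)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def hz_to_bark(f_hz):
    """Universal warp: Traunmueller's Bark-scale approximation (assumed here)."""
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53


def vtln_warp(f_hz, alpha, f_max=8000.0, break_ratio=0.85):
    """Speaker-specific piecewise-linear warp with factor alpha (hypothetical form).

    Below the breakpoint the axis is scaled by alpha; above it, a second
    linear segment maps f_max onto itself so the bandwidth is preserved.
    """
    # Shrink the breakpoint for alpha > 1 so the warped value stays below f_max.
    f_break = break_ratio * f_max * min(1.0, 1.0 / alpha)
    if f_hz <= f_break:
        return alpha * f_hz
    slope = (f_max - alpha * f_break) / (f_max - f_break)
    return alpha * f_break + slope * (f_hz - f_break)


if __name__ == "__main__":
    for f in (250.0, 1000.0, 4000.0):
        print(f, hz_to_mel(f), hz_to_bark(f), vtln_warp(f, alpha=1.1))
```

The speaker-specific warp needs a value of `alpha` to be estimated for every speaker, typically by a search over candidate factors, whereas the mel and Bark mappings are fixed and applied identically to all speakers.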
This paper presents pre-processing of input features to artificial neural network (NN). This is for ...
Accuracy of speaker verification is high under controlled conditions but falls off rapidly in the p...
A proven method for achieving effective automatic speech recognition (ASR) due to speaker difference...
In most automatic speech recognition (ASR) systems, speaker differences are compensated by normalizi...
Vocal Tract Length Normalization (VTLN) for standard filterbank-based Mel Frequency Cepstral Coeffic...
The overall success of automatic speech recognition (ASR) depends on efficient phoneme recognition p...
Augmenting datasets by transforming inputs in a way that does not change the label is a crucial ingr...
Vocal Tract Length Normalization (VTLN) has been shown to be an efficient speaker normalization tool...
This paper presents speaker normalization approaches for audio search task. Conventional state-of-th...
The advent of statistical speech synthesis has enabled the unification of the basic techniques used ...
A novel linear transform (LT) is proposed for frequency warping (FW) with s...
In speech recognition, speaker-dependence of a speech recognition system comes from speaker-dependen...
Voice transformation, for example, from a male speaker to a female speaker, is achieved here using a...
This paper proposes a novel feature-space VTLN (vocal tract length norma...
To reduce inter-speaker variability, vocal tract length normalization (VTLN) is commonly used to tra...