To reduce inter-speaker variability, vocal tract length normalization (VTLN) is commonly used to transform acoustic features for automatic speech recognition (ASR). The warp factors used in this process are usually derived by maximum likelihood (ML) estimation, involving an exhaustive search over possible values. We describe an alternative approach: exploit the correlation between a speaker's average pitch and vocal tract length, and model the probability distribution of warp factors conditioned on pitch observations. This can be used directly for warp factor estimation, or as a smoothing prior in combination with ML estimates. Pitch-based warp factor estimation for VTLN is effective and requires relatively little memory and computation. Su...
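The combination of a pitch-conditioned prior with ML estimates described above can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the warp grid (0.88 to 1.12 in steps of 0.02), the Gaussian form of the prior, its log-linear dependence on mean pitch, and the parameter values `REF_PITCH_HZ`, `SLOPE`, and `SIGMA`. The direction of the pitch-to-warp mapping also depends on the warping convention in use.

```python
import math

# Hypothetical warp-factor grid: 0.88, 0.90, ..., 1.12 (13 values).
WARP_GRID = [round(0.88 + 0.02 * i, 2) for i in range(13)]

# Hypothetical prior parameters (illustrative only, not from the paper).
REF_PITCH_HZ = 160.0   # pitch at which the prior is centred on alpha = 1.0
SLOPE = 0.15           # change in prior mean per unit of log-pitch
SIGMA = 0.03           # standard deviation of the prior


def log_prior(alpha, mean_pitch_hz):
    """Log of an assumed Gaussian p(alpha | pitch), up to a constant."""
    mean = 1.0 + SLOPE * math.log(mean_pitch_hz / REF_PITCH_HZ)
    return -0.5 * ((alpha - mean) / SIGMA) ** 2


def estimate_from_pitch(mean_pitch_hz):
    """Pitch-only estimate: the grid value with the highest prior probability."""
    return max(WARP_GRID, key=lambda a: log_prior(a, mean_pitch_hz))


def map_estimate(ml_log_likelihoods, mean_pitch_hz, prior_weight=1.0):
    """Combine per-warp ML log-likelihoods with the pitch prior.

    ml_log_likelihoods[i] is the acoustic log-likelihood obtained with
    WARP_GRID[i]; prior_weight = 0 reduces to the plain ML grid search.
    """
    scores = [
        ll + prior_weight * log_prior(a, mean_pitch_hz)
        for a, ll in zip(WARP_GRID, ml_log_likelihoods)
    ]
    best_index = max(range(len(WARP_GRID)), key=lambda i: scores[i])
    return WARP_GRID[best_index]
```

With `prior_weight=0` the function falls back to the exhaustive ML search, while uninformative (flat) likelihoods leave the pitch prior to decide the warp factor on its own; intermediate weights give the smoothing behaviour the abstract mentions.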
One of the main problems faced by automatic speech recognition is the variability of the testing con...
Vocal tract normalization (VTN) is an effective way to reduce inter-speaker variability mainly cause...
In most automatic speech recognition (ASR) systems, speaker differences are compensated by normalizi...
Vocal tract length normalization is an important feature normalization technique that can be used to...
Vocal Tract Length Normalisation (VTLN) is a commonly used technique to normalise for inter-speaker...
The advent of statistical speech synthesis has enabled the unification of the basic techniques used ...
Augmenting datasets by transforming inputs in a way that does not change the label is a crucial ingr...
Inter-speaker variability, one of the problems faced in speech recognition systems, has cau...
One of the main problems faced by automatic speech recognition is the variability of the testing con...
This paper investigates the application of Vocal Tract Length Normalisation (VTLN) for rob...
Vocal tract length normalization (VTLN) has been successfully used in automatic speech recognition f...
This paper proposes a novel feature-space VTLN (vocal tract length norma...
Vocal tract length normalisation (VTLN) is a commonly used speaker normalisation approach. It is att...
Recent research has demonstrated the effectiveness of vocal tract length normalization (VTLN) as a r...
Vocal tract length normalisation (VTLN) is a well known rapid adaptation technique. VTLN as a linear...