<p>Speaker-dependent (SD) ASR systems have significantly lower word error rates (WER) than speaker-independent (SI) systems. However, SD systems require sufficient training data from the target speaker, which is impractical to collect in a short time. We present a technique for training SD models using just a few minutes of the target speaker's data. We compensate for the lack of adequate speaker-specific data by selecting, from a database of existing speakers, neighbours who are acoustically close to the target speaker. These neighbours provide ample training data, which is used to adapt the SI model to obtain an initial SD model for the new speaker with significantly lower WER. We evaluate various neighbour selection algorithms on a large-scale m...
Several adaptation approaches have been proposed in an effort to improve the speech recognition perfor...
(Now with TEMIC SDS GmbH, Ulm, Germany). It has been demonstrated repeatedly that the acoustic model...
This paper shows the results achieved by the Maximum A Posteriori (MAP) speaker adaptation method i...
Traditional text independent speaker recognition systems are based on Gaussian Mixture Models (GMMs)...
Linear regression based speaker adaptation approaches can improve Automatic Speech Recognition (ASR)...
The performance of speech recognition systems in translating voice to text is still an issue in la...
This paper is concerned with automatic speech recognition (ASR) for accented speech. Given a small a...
LVCSR performance is consistently poor on low-proficiency non-native speech. While gains from speake...
Automatic speech recognition (ASR) in the educational environment could be a solution to address the...
This paper investigates techniques to compensate for the effects of regional accents of British Engl...
Automatic speech recognition (ASR) technology has matured over the past few decades and has made sig...
Inter-speaker variation can be handled rather well in speech recognition by speaker adaptation techniq...
INTERSPEECH2007: 8th Annual Conference of the International Speech Communication Association, August...
Self-supervised learning (SSL) has been able to leverage unlabeled data to boost the performance of ...