Thesis (Master's)--University of Washington, 2016-12
Given the vast amount of textual data available today, an efficient methodology for filtering and selecting the important and relevant portions of this data is highly beneficial for improving current natural language and speech processing systems. Although very large language models have been the industry norm in production automatic speech recognition systems, the focus is now shifting towards efficient ways to generate and use personalized and adapted language models, as they have proven to improve the end-user experience. Submodular methods have achieved great success in several domains: acoustic modeling, text summarization, and machine translation. They...
We experiment with subword segmentation approaches that are widely used to address the open vocabula...
Full text accessible only to members of the Université de Lorraine. One way to improve perfo...
Self-supervised learning (SSL) has been able to leverage unlabeled data to boost the performance of ...
In today's society, speech recognition systems have reached a mass audience, especially in the field...
I hereby declare that I am the sole author of this thesis. This is a true copy of the thesis, includ...
Automatic speech recognition (ASR) technology has matured over the past few decades and has made sig...
This paper compares schemes for the selection of multi-genre broadcast data and corresponding transc...
The performance of speech recognition systems in translating voice to text is still an issue in la...
This paper presents an extended study on the topic of optimal selection of speech data from a databa...
One particular problem in large vocabulary continuous speech recognition for low-resourced languages...
Over the past decades, speech recognition has dramatically improved in a large variety of applicatio...
We conduct a comparative study on selecting subsets of acoustic data for training phone recognizers...
Automatic speech recognition (ASR) requires a strong language model to guide the acoustic model and ...
Speaker dependent (SD) ASR systems have significantly lower word error rates (WER) compared to speak...
This research addresses the language model (LM) domain mismatch problem in automatic speech recognit...