We conduct a comparative study on selecting subsets of acoustic data for training phone recognizers. The data selection problem is approached as a constrained submodular optimization problem. Previous applications of this approach required transcriptions or acoustic models trained in a supervised way. In this paper we develop and evaluate a novel and entirely unsupervised approach, and apply it to TIMIT data. Results show that our method consistently outperforms a number of baseline methods while being computationally very efficient and requiring no labeling.
Index Terms: speech processing, automatic speech recognition, machine learning
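As a rough illustration of what constrained submodular data selection can look like, the sketch below greedily picks utterances under a budget. It is not the method evaluated in the paper: the facility-location objective, the cost-scaled greedy rule, and the names `facility_location`, `greedy_select`, `similarity`, `costs`, and `budget` are all assumptions made for this example, and the similarity matrix is taken as given (the abstract does not say how an unsupervised similarity would be computed).

```python
from typing import List, Sequence


def facility_location(similarity: Sequence[Sequence[float]],
                      selected: List[int]) -> float:
    """f(S) = sum over all items i of max_{j in S} similarity[i][j]; f(empty) = 0.

    Assumed objective for illustration; it is monotone submodular when all
    similarities are non-negative.
    """
    if not selected:
        return 0.0
    return sum(max(row[j] for j in selected) for row in similarity)


def greedy_select(similarity: Sequence[Sequence[float]],
                  costs: Sequence[float],
                  budget: float) -> List[int]:
    """Repeatedly add the item with the largest marginal gain per unit cost,
    skipping items that would exceed the budget (e.g. total audio duration)."""
    selected: List[int] = []
    spent = 0.0
    remaining = set(range(len(costs)))
    while remaining:
        current = facility_location(similarity, selected)
        best, best_ratio = None, 0.0
        for j in sorted(remaining):
            if spent + costs[j] > budget:
                continue  # item j does not fit in the remaining budget
            gain = facility_location(similarity, selected + [j]) - current
            ratio = gain / costs[j]
            if ratio > best_ratio:
                best, best_ratio = j, ratio
        if best is None:
            break  # nothing fits, or nothing adds value
        selected.append(best)
        spent += costs[best]
        remaining.remove(best)
    return selected


# Hypothetical 4-utterance example: entry [i][j] says how well j represents i.
sim = [
    [1.0, 0.8, 0.1, 0.0],
    [0.9, 1.0, 0.1, 0.0],
    [0.1, 0.1, 1.0, 0.2],
    [0.0, 0.0, 0.3, 1.0],
]
subset = greedy_select(sim, costs=[1.0, 1.0, 1.0, 1.0], budget=2.0)
```

In this toy example utterances 0 and 1 are near-duplicates, so after one of them is chosen the other adds little value and the second pick goes to a dissimilar utterance. This naive version re-evaluates the objective for every candidate; lazy evaluation of marginal gains is a common way to speed it up.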
This paper compares schemes for the selection of multi-genre broadcast data and corresponding transc...
Training a speech recognition system needs audio data and their corresponding ...
Development of an ASR application such as a speech-oriented guidance system for a real environment i...
We apply methods for selecting subsets of dimensions from high-dimensional score spaces, and subsets...
We apply submodular function based document summarization methods to the problem of subselecting sa...
Thesis (Master's), University of Washington, 2016-12. Given the vast amount of textual data that we ha...
We study two key issues in task-independent training, namely selection of a universal set of subword...
In this paper we construct a data set for semi-supervised acoustic model training by selecting spok...
Self-supervised speech recognition models require considerable labeled training data for learning hi...
This paper presents a data selection approach where spoken utterances are selected in a sequential ...
INTERSPEECH2006: the 9th International Conference on Spoken Language Processing (ICSLP), September 1...
We introduce submodular optimization to the problem of training data subset selection for statistica...
Large multi-speaker datasets for TTS typically contain diverse speakers, recording conditions, style...
This work is intended to explore the performance of a new set of acoustic model units in speech reco...