Superficially, read and spontaneous speech, the two main kinds of training data for automatic speech recognition, appear complementary but equivalent: both are pairs of texts and acoustic signals. Yet spontaneous speech is typically harder to recognize. This is usually explained by different kinds of variation and noise, but there is a more fundamental difference at play: for read speech, the audio signal is produced by reciting a given text, whereas for spontaneous speech, the text is transcribed from a given signal. In this review, we embrace this difference by presenting a first introduction of causal reasoning into automatic speech recognition and describing causality as a tool to study speaking styles and training data. After breakin...
Although speech derived from reading texts and similar types of speech, e.g. that from reading new...
Although spontaneous speech occurs more frequently in most listeners’ experience than read speech, l...
We describe three analyses on the effects of spontaneous speech on continuous speech recognition per...
In automatic speech recognition, a statistical language model (LM) predicts the probability of the n...
Compared to dictation systems, recognition systems for spontaneous speech still perform rather poorl...
Intrinsic variability of the speaker in spontaneous speech remains a challenge to state-of-the-art A...
Listeners are typically able to identify speech as either being produced spontaneously or read from ...
In spontaneous speech, speakers segment their speech into intonational phrases, and make repairs to ...
This paper presents analyses and recognition experiments on spontaneous American English speech co...
This paper reports various investigations on recognizing spontaneous presentation speech in connecti...
In automatic speech recognition, a stochastic language model (LM) predicts the probability of the ne...