This paper compares human and automatic recognition of words in fluent and disfluent spontaneous speech. In a word-level gating study with confidence judgements, we examine how human word recognition, and listeners' confidence in it, develop over the course of an utterance, and show how disfluency disrupts the process. We give an automatic recogniser the same task and compare its performance with the humans’. With both systems, subsequent context supports word recognition: confidence in word recognition peaks after subsequent words have been heard. With both systems, disfluency adversely affects recognition of words in the immediate vicinity of the disfluent interruption (for repeats and repairs): disrupted subsequent c...
Humans are able to recognise a word before its acoustic realisation is complete. This is in contrast to...
Two experiments examined the dynamics of lexical activation in spoken-word recognition. In both, the...
Three experiments investigated listeners' ability to detect disfluency in spontaneous speech. All em...
We describe three analyses on the effects of spontaneous speech on continuous speech recognition per...
In automatic speech recognition, a statistical language model (LM) predicts the probability of the n...
Unlike rehearsed and prepared speech, spontaneous speech contains a high occurrence of disfluencies, l...
To investigate problems of spontaneous speech recognition using N-grams and HMMs and estimate the ro...
Research in the area of speech perception has shown that prosodic, or suprasegmental, information co...
A Visually Grounded Speech model is a neural model which is trained to embed image caption pairs clo...
An eye-tracking experiment examined contextual flexibility in speech processing in response to disto...
Thesis (Ph.D.)--University of Washington, 2021. Considering the complexity of speech communicatio...
We investigated word recognition in a Visually Grounded Speech model. The model has been trained on ...