The present contribution aims to improve our understanding of automatic speech recognition (ASR) errors involving frequent homophone or near-homophone words by comparing them with perceptual results. The long-term aim is to improve acoustic modelling of these items in order to reduce automatic transcription errors. A first question of interest is whether homophone words such as et (and) and est (is), for which ASR systems rely on language model weights, can be discriminated in a perceptual transcription test with similar n-gram constraints. A second question concerns the acoustic separability of the two homophone words using appropriate acoustic and prosodic attributes. The perceptual test reveals that even though automatic and perceptual e...
We explore the use of machine learning techniques (notably SVM classifiers and Conditional Random Fi...
Despite the lack of clear word boundaries in spoken language, the human abilit...
We designed two experiments that tested the listeners' perceptual capacities d...
It is widely acknowledged that human listeners significantly outperform machin...
This thesis focuses on acoustic and prosodic (fundamental frequency (F0), duration, intensity) analy...
It is well known that human listeners significantly outperform machines when it comes to transcribin...
This paper concerns the study of information derived from the melodic, temporal and intensity charac...
Automatic speech recognition (ASR) systems now perform well enough to be integrated in va...
Native listeners process and understand homophones, such as la locution ‘the p...
This study explores automatic speech recognition (ASR) errors from a syntax-pr...
In this paper we propose a multi-step system for the semiautomatic detection and annotation of disfl...