This paper quantifies the value of pronunciation lexicons in large vocabulary continuous speech recognition (LVCSR) systems that support keyword search (KWS) in low-resource languages. State-of-the-art LVCSR and KWS systems are developed for conversational telephone speech in Tagalog, and the baseline lexicon is augmented via three different grapheme-to-phoneme models that yield increasing coverage of a large Tagalog word-list. It is demonstrated that while the increased lexical coverage, or reduced out-of-vocabulary (OOV) rate, leads to only modest (ca. 1%-4%) improvements in word error rate, the concomitant improvements in actual term weighted value are as much as 60%. It is also shown that incorporating the augmented lexicons into the...
State of the art technologies for speech recognition are very accurate for heavily studied languages...
This paper investigates the application of hierarchical MRASTA bottleneck (BN) features for under-re...
Spoken content in languages of emerging importance needs to be searchable to provide access to the u...
In this paper we aim to enhance keyword search for conversational telephone sp...
“Phonetic Search Methods for Large Databases” focuses on Keyword Spotting (KWS) within large speech ...
A regular automatic speech recognizer works with a so-called recognition lexicon. This lexicon conta...
This paper examines the impact of multilingual (ML) acoustic representations on Automatic Speech Rec...
This paper presents recent progress in developing speech-to-text (STT) and keyword spotting (KWS) sy...
We describe the use of text data scraped from the web to augment language models for Automatic Speec...
In pursuance of better performance, current speech recognition systems tend to use more and more com...
This paper reports on investigations using two techniques for language model t...
The research presented in the paper addresses conversational telephone speechre...
This paper presents different methods of handling pronunciation variations in Cantonese large-vocabu...
Recently there has been increased interest in Automatic Speech Recognition (ASR) and Key Word Spotti...
The point process model (PPM) for keyword search is a whole-word parametric modeling framework based...