Domain language model (LM) adaptation consists of re-estimating the probabilities of a baseline LM to better match the peculiarities of a given broad topic of interest. A common strategy is to retrieve adaptation texts from the Web based on a given domain-representative seed text. In this report, we study this process extensively by analyzing the impact of numerous parameters. The domain adaptation is carried out on a set of videos dealing with business and management. The results mainly show which Web querying strategies perform best and how significantly the supervision level of the adaptation process impacts overall performance.
This paper introduces a selection-based LM using topic modeling for the purpose of domain adaptation...
ASR model deployment environment is ever-changing, and the incoming speech can be switched across di...
Statistical Machine Translation (SMT) is currently used in real-time and commercial settin...
Domain adaptation of a language model aims at re-estimating word sequence probabilities in order to ...
Recent advances in NLP are brought by a range of large-scale pretrained language models (PLMs). Thes...
With the fast growth of the amount of digitalized texts in recent years, text information management...
The goal of this thesis is to implement a system for automatic language model adaptation for the Phonexia ASR sy...
Differences in domains of language use between training data and test data have often been reported ...
Domain adaptation for machine translation (MT) can be achieved by selecting training instances close...
The goal of vocabulary optimization is to construct a vocabulary with exactly ...
The performance of a machine learning model trained on labeled data of a (source) domain degrades se...
Domain adaptation for machine translation (MT) can be achieved by selecting training instances close...
Automatic speech recognition models are often adapted to improve their accuracy in a new domain. A p...
This research addresses the language model (LM) domain mismatch problem in automatic speech recognit...
Whereas topic-based adaptation of language models (LM) claims to increase the ...