This paper compares two approaches to reconstructing lexical compound words from speech recognizer output in which compound words have been decomposed. The first method, proposed earlier, uses a dedicated language model that models compound tails in the context of the preceding words and compound heads only in the context of the tail. The second, novel approach models possible connectors between compound particles as hidden events and predicts such events using a simple N-gram language model. Experiments on two Estonian speech recognition tasks show that the second approach performs consistently better and achieves high accuracy.
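As a rough illustration of the hidden-event idea, the sketch below trains a bigram model on text in which compound boundaries are marked with an explicit connector token, then decides at each gap in the recognizer output whether inserting that token makes the sequence more probable. This is a minimal sketch under stated assumptions, not the paper's implementation: the connector symbol `<+>`, the add-k smoothing, the function names, and the toy Estonian training sentences are all invented for illustration. With a bigram model the decisions at different gaps are independent; a higher-order N-gram would need a Viterbi search over the hidden events instead.

```python
from collections import Counter
import math

CONN = "<+>"  # hypothetical symbol for the hidden compound-connector event


def train_bigram(sentences):
    """Count contexts and bigrams over token streams that include CONN."""
    context, bigram, vocab = Counter(), Counter(), set()
    for tokens in sentences:
        seq = ["<s>"] + tokens + ["</s>"]
        vocab.update(seq)
        for a, b in zip(seq, seq[1:]):
            context[a] += 1
            bigram[(a, b)] += 1
    return context, bigram, vocab


def logp(model, a, b, k=0.1):
    """Add-k smoothed log P(b | a); the smoothing scheme is an assumption."""
    context, bigram, vocab = model
    return math.log((bigram[(a, b)] + k) / (context[a] + k * len(vocab)))


def reconstruct(model, units):
    """Join adjacent units wherever the connector event is more probable."""
    if not units:
        return []
    out = [units[0]]
    for prev, nxt in zip(units, units[1:]):
        separate = logp(model, prev, nxt)
        joined = logp(model, prev, CONN) + logp(model, CONN, nxt)
        if joined > separate:
            out[-1] += nxt   # hidden connector predicted: glue the parts
        else:
            out.append(nxt)  # ordinary word boundary
    return out


# Toy training text (invented): "raud <+> tee" stands for the compound "raudtee".
train = [
    ["uus", "raud", CONN, "tee", "avati"],
    ["raud", CONN, "tee", "on", "pikk"],
    ["see", "tee", "on", "pikk"],
    ["uus", "tee", "avati"],
]
model = train_bigram(train)

print(reconstruct(model, ["uus", "raud", "tee", "avati"]))
# -> ['uus', 'raudtee', 'avati']
```

On this toy data the word "tee" is joined only when it follows "raud"; after "see" it is left as a separate word, because the connector was never observed in that context.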
The essential feature of a lexicon-grammar is that the elementary unit of comp...
Compounding is a highly productive word-formation process in some languages that is often problemati...
Compounding is one of the most productive word formation processes in many languages and is therefor...
Proceedings of the 16th Nordic Conference of Computational Linguistics NODALIDA-2007. Editors: Jo...
This paper addresses compound splitting for Dutch in the context of broadcast news transcription. La...
In the speech recognition of highly inflecting or compounding languages, the traditional word-based ...
Compounding is present in a large variety of languages in different proportion...
Unlike the English language, languages such as German, Dutch, the Scandinavian languages or Greek fo...
© 2015 IEEE. Compounding is one of the most productive word formation processes in many languages an...
We present an approach for knowledge-free and unsupervised recognition of compound nouns ...
In Technical Report No. 75 I proposed a method for describing compound words in Finnish. The aim in ...
In this paper we present a novel clustering technique for compound words. By mapping compounds onto ...
In this article we investigate statistical machine translation (SMT) into Germanic languages, with a...
An improved statistical model is proposed in this paper for extracting compound words from a text co...