In this article we investigate statistical machine translation (SMT) into Germanic languages, with a focus on compound processing. Our main goal is to enable the generation of novel compounds that have not been seen in the training data. We adopt a split-merge strategy, where compounds are split before training the SMT system, and merged after the translation step. This approach reduces sparsity in the training data, but runs the risk of placing translations of compound parts in non-consecutive positions. It also requires a postprocessing step of compound merging, where compounds are reconstructed in the translation output. We present a method for increasing the chances that components that should be merged are translated into contiguous po...
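The split-merge strategy described in the abstract above can be illustrated with a minimal sketch. It is not the article's method: the toy lexicon, the `@@` marker convention, and the function names `split_compound`, `split_tokens`, and `merge_tokens` are all assumptions made for illustration, and the merging step here is a plain marker-based concatenation.

```python
# Toy illustration only: the lexicon, the "@@" marker, and all function
# names are assumptions, not taken from the article.
LEXICON = {"haus", "tür", "schlüssel"}  # hypothetical known compound parts
MARK = "@@"  # hypothetical suffix flagging a non-final compound part


def split_compound(word, lexicon=LEXICON):
    """Split a word into known parts; return [word] if no full split exists."""

    def helper(s):
        if not s:
            return []
        # Try the longest known prefix first, then backtrack to shorter ones.
        for end in range(len(s), 0, -1):
            if s[:end] in lexicon:
                rest = helper(s[end:])
                if rest is not None:
                    return [s[:end]] + rest
        return None

    parts = helper(word.lower())
    if parts is None or len(parts) < 2:
        return [word]
    return parts


def split_tokens(tokens, lexicon=LEXICON):
    """Pre-training step: replace each compound by its marked parts."""
    out = []
    for tok in tokens:
        parts = split_compound(tok, lexicon)
        out.extend(p + MARK for p in parts[:-1])
        out.append(parts[-1])
    return out


def merge_tokens(tokens):
    """Post-translation step: glue each marked part onto the following token."""
    out, pending = [], ""
    for tok in tokens:
        if tok.endswith(MARK):
            pending += tok[: -len(MARK)]
        else:
            out.append(pending + tok)
            pending = ""
    if pending:  # a dangling marked part at the end is emitted on its own
        out.append(pending)
    return out


print(split_tokens(["die", "Haustür"]))
print(merge_tokens(["der", "tür@@", "schlüssel"]))
```

In this sketch, merging can produce a form such as `türschlüssel` even though that compound never appears as a whole in the lexicon. That is the sense in which splitting allows novel compounds to be generated. Merging only works when the marked parts end up adjacent in the output, which is the risk the abstract raises about non-consecutive positions. Recasing and German linking elements are deliberately left out.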
This paper presents a method to improve the translation of polysemous nouns, leveraging on their pr...
In this work, we present a novel compound splitting method for German by capturing the compound prod...
Compounds pose a problem for applications that rely on precise word alignments such as bilingual ter...
In this thesis I explore how compound processing can be used to improve phrase-based statistical mac...
Compound splitting is an important problem in many NLP applications which must be solved in order to...
In this article, compound processing for translation into German in a factored statistical MT syste...
Compounding in morphologically rich languages is a highly productive process which often causes SMT ...
The paper presents an approach to morphological compound splitting that takes the degree of composit...
In this thesis I aim to improve phrase-based statistical machine translation (PBSMT) in a number of ...
Compounding is present in a large variety of languages in different proportions. Compound rate in th...
Unlike the English language, languages such as German, Dutch, the Scandinavian languages or Greek fo...
Traditionally, compound splitters are evaluated intrinsically on gold-standard data or extrinsically...
The subject of investigation of this thesis is the building blocks of translation in Statistical Mac...