Training the translation model in statistical machine translation requires a sentence-aligned parallel corpus in the source and target languages. Most available parallel corpora are at best document-aligned, so sentence alignment is performed on the document-aligned corpus as a pre-processing step before word alignment and construction of the phrase translation table. During sentence alignment, some data is discarded for "quality reasons", usually because of N:1 sentence alignments. This work presents a set of rules based on an empirical analysis of discourse strategies in the data discarded while aligning Europarl. The rules are designed to split the long sentence in 2:1/1:2 sentence alignments, leading...
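To make the filtering step described in the abstract concrete, the following is a minimal sketch of how the output of a sentence aligner might be partitioned by alignment type. It is an illustration under stated assumptions, not the authors' implementation: the bead representation (a pair of sentence lists), the function names `bead_type` and `partition_beads`, and the example sentences are all hypothetical, and the splitting rules themselves are not reproduced here because the abstract does not spell them out.

```python
# Hypothetical sketch: partition aligner output by alignment type.
# A "bead" is assumed to be a pair (source_sentences, target_sentences).

def bead_type(bead):
    """Return the alignment type as (number of source, number of target) sentences."""
    src, tgt = bead
    return (len(src), len(tgt))

def partition_beads(beads):
    """Split beads into 1:1 pairs, 2:1/1:2 split candidates, and everything else."""
    kept, candidates, discarded = [], [], []
    for bead in beads:
        kind = bead_type(bead)
        if kind == (1, 1):
            kept.append(bead)        # usable directly for word alignment
        elif kind in {(2, 1), (1, 2)}:
            candidates.append(bead)  # the long side could be split by rules
        else:
            discarded.append(bead)   # other types are left out in this sketch
    return kept, candidates, discarded

# Invented example data, for illustration only.
beads = [
    (["Das ist gut."], ["That is good."]),
    (["Er kam.", "Sie ging."], ["He came, and she left."]),
    (["Ja."], []),
]
kept, candidates, discarded = partition_beads(beads)
print(len(kept), len(candidates), len(discarded))
```

In a standard pipeline only the first group would reach word alignment; the second group is the data that the splitting rules aim to recover.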
Training a state-of-the-art syntax-based statistical machine translation (MT) system to translate fr...
This article presents a method for aligning words between translations that imposes a composition...
This paper describes a method for the automatic alignment of parallel texts at clause level. The met...
In machine translation, the alignment of corpora has evolved into a mature research area, aimed at p...
All state-of-the-art statistical machine translation systems and many example-based mach...
The parameters of statistical translation models are typically estimated from sentence-aligned paral...
When parallel or comparable corpora are harvested from the web, there is typically a tradeoff betwee...
The goal of a machine translation (MT) system is to automatically translate a document written in so...
Statistical word alignments represent lexical word-to-word translations between source and target la...
In this paper we describe a statistical technique for aligning sentences with their translations in...
Most statistical machine translation systems employ a word-based alignment model. In this paper we d...
In most statistical machine translation (SMT) systems, bilingual segments are extracted via word al...
The main problems of statistical word alignment lie in the facts that source words can only be align...
Statistically training a machine translation model requires a parallel corpus containing a huge amo...