This paper presents WAGS (Word Alignment Gold Standard), a novel benchmark that allows extensive evaluation of word alignment (WA) tools on out-of-vocabulary (OOV) and rare words. WAGS is a subset of the Common Test section of the Europarl English-Italian parallel corpus, specifically tailored to OOV and rare words. It comprises 6,715 sentence pairs containing 11,958 occurrences of OOV and rare words, i.e. words occurring at most 15 times in the Europarl Training set (5,080 English words and 6,878 Italian words), representing almost 3% of the whole text. Since WAGS focuses on OOV/rare words, manual alignments are provided for these words only, not for whole sentences. Two off-the-shelf word aligners have been evaluated on WAGS, and results have been...
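The selection criterion described in the abstract (words that are OOV or occur at most 15 times in the training set) can be illustrated with a minimal sketch. This is not the authors' procedure: the function name `rare_or_oov_tokens`, the whitespace tokenization, and the case-sensitive matching are all assumptions made for illustration.

```python
from collections import Counter


def rare_or_oov_tokens(train_sentences, test_sentences, max_freq=15):
    """Return (sentence_index, token_index, token) for every test token whose
    training-set frequency is at most max_freq; a frequency of 0 means OOV.

    Hypothetical helper: whitespace tokenization and case-sensitive matching
    are assumptions, not details taken from the paper.
    """
    # Count how often each token occurs in the training sentences.
    train_counts = Counter(
        tok for sent in train_sentences for tok in sent.split()
    )
    hits = []
    for i, sent in enumerate(test_sentences):
        for j, tok in enumerate(sent.split()):
            # Counter returns 0 for unseen tokens, so OOV words are included.
            if train_counts[tok] <= max_freq:
                hits.append((i, j, tok))
    return hits
```

Applied separately to each side of a parallel test set, a filter like this would yield the token positions for which manual alignments are needed; the default threshold of 15 mirrors the figure stated in the abstract.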
In this paper we build on our methodology for combining and selecting alignment techniques for vocab...
Forced alignment automatically aligns audio recordings of spoken language with transcripts at the le...
This work describes the evaluations of two approaches, Lexical Matching and Se...
Proceedings of the 18th Nordic Conference of Computational Linguistics NODALIDA 2011. Editors: Bol...
All state-of-the-art statistical machine translation systems and many example-based mach...
Using multilingual word embeddings for computing word alignments has been shown to be competitive wi...
Rare word representation has recently enjoyed a surge of interest, owing to the crucial role that ef...
Automatic word alignment is a key step in training statistical machine translation systems. Despite ...
In this paper we present KNOWA, an English/Italian word aligner, developed at ITC-irst, which relies...
A Gold Standard Word Alignment for English-Swedish (GES) is a resource containing 1164 manually word...
In this thesis we present the idea of using parallel phrases for word alignment. Each parallel phras...
This paper reports an experience on producing manual word alignments over six different language pai...
In this paper we present a new and simple language-independent method for word-alignment based on th...