This paper focuses on two aspects of Machine Translation: parallel corpora and translation models. First, we present a method to automatically build parallel corpora from subtitle files gathered from the Internet, which yields useful data for subtitle machine translation. Our method is based on Dynamic Time Warping. We evaluated this alignment method by comparing its output with a hand-aligned sample and obtained an alignment precision of $0.92$. Second, we use the notion of inter-lingual triggers to build multilingual dictionaries and translation tables for machine translation from the subtitle parallel corpora. Inter-lingual triggers make it possible to detect pairs of source and target ...
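To make the alignment step more concrete, the following is a minimal sketch of Dynamic Time Warping applied to two subtitle tracks. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which matching cost is used, so this sketch assumes the cost of pairing two subtitles is the absolute difference of their start times in seconds. The function name `dtw_align` and its parameters are hypothetical.

```python
from typing import List, Tuple


def dtw_align(src_times: List[float], tgt_times: List[float]) -> List[Tuple[int, int]]:
    """Align two subtitle tracks with classic DTW.

    Assumption (not from the paper): the local cost of pairing source
    subtitle i with target subtitle j is |start_i - start_j| in seconds.
    Returns the warping path as (source_index, target_index) pairs.
    """
    n, m = len(src_times), len(tgt_times)
    inf = float("inf")

    # cost[i][j] = minimal cumulative cost of aligning the first i source
    # subtitles with the first j target subtitles.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(src_times[i - 1] - tgt_times[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j - 1],  # one-to-one step
                cost[i - 1][j],      # several source subtitles for one target
                cost[i][j - 1],      # several target subtitles for one source
            )

    # Backtrack from the end of both tracks to recover the path.
    path: List[Tuple[int, int]] = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return path


if __name__ == "__main__":
    # Hypothetical start times: the target track splits the first subtitle in two.
    print(dtw_align([1.0, 5.0], [1.0, 1.5, 5.0]))
```

In this toy example the first source subtitle is paired with the first two target subtitles, which is the kind of one-to-many link a subtitle aligner has to tolerate. A real system would likely replace the timing-only cost with a measure that also looks at the subtitle text.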
Parallel corpora extracted from online repositories of movie and TV subtitles are employed in a wide...
In this paper, we present the idea of cross-lingual triggers. We exploit this formalism in order to ...
This paper describes an approach based on word alignment on parallel corpora, which aims at facilita...
This paper proposes to use DTW to construct parallel corpora from difficult da...
This paper describes a methodology for constructing aligned German-Chinese corpora from m...
SUMAT is a project funded through the EU ICT Policy Support Programme (2011–2014). It involves four ...
This paper presents a method for compiling a large-scale bilingual corpus from a database of movie s...
In this paper, we leverage the existence of dual subtitles as a source of parallel data. Dual sub...
In this paper, we propose a new phrase-based translation model based on inter-...
Due to the lack of ideal resources, few researchers have investigated how to improve the machine tra...