Growing needs in localising audiovisual content in multiple languages through subtitles call for the development of automatic solutions for human subtitling. Neural Machine Translation (NMT) can contribute to the automatisation of subtitling, facilitating the work of human subtitlers and reducing turn-around times and related costs. NMT requires high-quality, large, task-specific training data. The existing subtitling corpora, however, are missing both alignments to the source language audio and important information about subtitle breaks. This poses a significant limitation for developing efficient automatic approaches for subtitling, since the length and form of a subtitle directly depend on the duration of the utterance. In this work, w...
Speech translation (ST) has lately received growing interest for the generation of subtitles without...
We present baseline results for a new task of automatic segmentation of Sign L...
This article proposes a new methodology for multimodal corpus analysis. It does so by particularly f...
Growing needs in localising multimedia content for global audiences have resulted in Neural Machine ...
Growing needs in translating multimedia content have resulted in Neural Machine Translation (NMT) ...
Subtitles, in order to achieve their purpose of transmitting information, need to be easily readable...
We implemented a neural machine translation system that uses automatic sequence tagging to improve t...
This paper presents a method for compiling a large-scale bilingual corpus from a database of movie s...
Due to the lack of ideal resources, few researchers have investigated how to improve the machine tra...
Proceedings of the 17th Nordic Conference of Computational Linguistics NODALIDA 2009. Editors: Kri...
Parallel corpora extracted from online repositories of movie and TV subtitles are employed in a wide...
This paper focuses on two aspects of Machine Translation: parallel corpora and...
Speech translation for subtitling (SubST) is the task of automatically translating speech data into ...
This paper describes a methodology for constructing aligned German-Chinese corpora from m...
Recently, with the development of Speech to Text, which converts voice to text, and machine translat...