Document-level contextual information has been shown to benefit text-based machine translation, but whether and how context helps end-to-end (E2E) speech translation (ST) is still under-studied. We fill this gap through extensive experiments using a simple concatenation-based context-aware ST model, paired with adaptive feature selection on speech encodings for computational efficiency. We investigate several decoding approaches and introduce in-model ensemble decoding, which jointly performs document- and sentence-level translation using the same model. Our results on the MuST-C benchmark with Transformer demonstrate the effectiveness of context for E2E ST. Compared to sentence-level ST, context-aware ST obtains better translation quality (+0.1...
End-to-end speech-to-speech translation (S2ST) without relying on intermediate text representations ...
Training end-to-end speech translation (ST) systems requires sufficiently large-scale data, which is...
Recently, the use of syntax has very effectively improved machine translation (MT) quality in many t...
Traditional machine translation industrial systems usually handle sentences independently, neglectin...
End-to-end (E2E) speech-to-text translation (ST) often depends on pretraining its encoder and/or dec...
We present a method for introducing a text encoder into pre-trained end-to-end speech translation sy...
Direct speech-to-text translation (ST) models are usually trained on corpora segmented at sentence l...
Information in speech signals is not evenly distributed, making it an additional challenge for end-t...
Speech translation is the translation of speech in one language typically to text in another, tradit...
Fully Attentional Networks (FAN) like Transformer (Vaswani et al. 2017) have shown superior results i...
We investigate end-to-end speech-to-text translation on a corpus of audiobooks...
Speech-to-text translation (ST), which translates source language speech into target language text, ...
Machine processing of speech, while it has advanced significantly, is still insufficient in...