End-to-end simultaneous speech translation (SimulST), also known as streaming speech translation, outputs translation while still receiving streaming speech input, and hence needs to segment the speech input and translate based on the speech received so far. However, segmenting the speech at unfavorable moments can disrupt acoustic integrity and degrade the performance of the translation model. Learning to segment the speech at moments that help the translation model produce high-quality translation is therefore the key to SimulST. Existing SimulST methods, which use either fixed-length segmentation or an external segmentation model, always separate segmentation from the underlying translation ...
Boosted by the simultaneous translation shared task at IWSLT 2020, promising e...
Speech segmentation, which splits long speech into short segments, is essential for speech translati...
The cascade approach to Speech Translation (ST) is based on a pipeline that concatenates an Aut...
Speech translation models are unable to directly process long audios, like TED talks, which have to ...
Transformer models using segment-based processing have been an effective architecture for simultaneo...
Simultaneous speech translation (SimulST) is the task in which output generation has to be performed...
Streaming Machine Translation (MT) is the task of translating an unbounded input text stream in real...
Segmentation methods are an essential part of the simultaneous machine translation process because, ...
Simultaneous speech translation (SimulST) is a challenging task aiming to translate streaming speech...
Simultaneous translation systems start producing the output while processing the partial source sent...
Article under review at Interspeech 2022. Speech translation models are unable to directly proc...
Segmentation of the incoming speech stream and translating segments incrementally is a commonly used...
In simultaneous speech translation (SimulST), finding the best trade-off between high translation qu...
In this paper, we introduce our work of building a Streaming Multilingual Speech Model (SM2), which ...