Optimization of modern ASR architectures is among the highest priority tasks since it saves many computational resources for model training and inference. The work proposes a new Uconv-Conformer architecture based on the standard Conformer model. It consistently reduces the input sequence length by 16 times, which results in speeding up the work of the intermediate layers. To solve the convergence issue connected with such a significant reduction of the time dimension, we use upsampling blocks like in the U-Net architecture to ensure the correct CTC loss calculation and stabilize network training. The Uconv-Conformer architecture appears to be not only faster in terms of training and inference speed but also shows better WER compared to the...
Training deep neural network based Automatic Speech Recognition (ASR) models often requires thousand...
Owing to the loss of effective information and incomplete feature extraction caused by the convoluti...
Sequence transducers, such as the RNN-T and the Conformer-T, are one of the most promising models of...
International audienceThe recently proposed Conformer architecture has shown state-of-the-art perfor...
This study addresses robust automatic speech recognition (ASR) by introducing a Conformer-based acou...
Reducing the latency and model size has always been a significant research problem for live Automati...
While transformers and their variant conformers show promising performance in speech recognition, th...
Automatic speech recognition research focuses on training and evaluating on static datasets. Yet, as...
The two most popular loss functions for streaming end-to-end automatic speech recognition (ASR) are ...
As one of the most popular sequence-to-sequence modeling approaches for speech recognition, the RNN-...
Conformers have recently been proposed as a promising modelling approach for automatic speech recogn...
In recent research, in the domain of speech processing, large End-to-End (E2E) systems for Automatic...
End-to-end automatic speech recognition systems represent the state of the art, but they rely on tho...
Convolutional neural networks (CNN) and Transformer have wildly succeeded in multimedia applications...
As a result of advancement in deep learning and neural network technology, end-to-end models have be...
Training deep neural network based Automatic Speech Recognition (ASR) models often requires thousand...
Owing to the loss of effective information and incomplete feature extraction caused by the convoluti...
Sequence transducers, such as the RNN-T and the Conformer-T, are one of the most promising models of...
International audienceThe recently proposed Conformer architecture has shown state-of-the-art perfor...
This study addresses robust automatic speech recognition (ASR) by introducing a Conformer-based acou...
Reducing the latency and model size has always been a significant research problem for live Automati...
While transformers and their variant conformers show promising performance in speech recognition, th...
Automatic speech recognition research focuses on training and evaluating on static datasets. Yet, as...
The two most popular loss functions for streaming end-to-end automatic speech recognition (ASR) are ...
As one of the most popular sequence-to-sequence modeling approaches for speech recognition, the RNN-...
Conformers have recently been proposed as a promising modelling approach for automatic speech recogn...
In recent research, in the domain of speech processing, large End-to-End (E2E) systems for Automatic...
End-to-end automatic speech recognition systems represent the state of the art, but they rely on tho...
Convolutional neural networks (CNN) and Transformer have wildly succeeded in multimedia applications...
As a result of advancement in deep learning and neural network technology, end-to-end models have be...
Training deep neural network based Automatic Speech Recognition (ASR) models often requires thousand...
Owing to the loss of effective information and incomplete feature extraction caused by the convoluti...
Sequence transducers, such as the RNN-T and the Conformer-T, are one of the most promising models of...