Reducing latency and model size has long been a significant research problem for live Automatic Speech Recognition (ASR) applications. Along this direction, model quantization has become an increasingly popular approach to compressing neural networks and reducing computation cost. Most existing practical ASR systems apply post-training 8-bit quantization. To achieve a higher compression rate without introducing additional performance regression, in this study we propose to develop 4-bit ASR models with native quantization-aware training, which leverages native integer operations to optimize both training and inference. We conducted two experiments on state-of-the-art Conformer-based ASR models to evaluate our p...
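To make the idea of 4-bit quantization concrete, the following is a minimal sketch of symmetric per-tensor quantization to signed 4-bit integers. It is not the method of the study above: the function names `quantize_int4` and `dequantize` are hypothetical, the per-tensor scaling scheme is an assumption, and the arithmetic is simulated in floating point rather than with native integer operations.

```python
def quantize_int4(weights):
    """Map floats to signed 4-bit integers in [-8, 7] with one shared scale.

    Assumption: symmetric per-tensor scaling, chosen only for illustration.
    """
    max_abs = max(abs(w) for w in weights)
    # Map the largest magnitude to 7 so every value fits without clipping.
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    # Round to the nearest integer, then clamp to the 4-bit signed range.
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float values from the 4-bit integers."""
    return [v * scale for v in q]


weights = [7.0, -3.0, 0.0, 1.4]
q, scale = quantize_int4(weights)
restored = dequantize(q, scale)
```

Each restored value differs from the original by at most half of `scale`, which is the rounding error that quantization-aware training lets a model adapt to during training.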
The performance of the speech recognition systems to translate voice to text is still an issue in la...
While transformers and their variant conformers show promising performance in speech recognition, th...
In recent research, in the domain of speech processing, large End-to-End (E2E) systems for Automatic...
This study addresses robust automatic speech recognition (ASR) by introducing a Conformer-based acou...
Optimization of modern ASR architectures is among the highest priority tasks since it saves many com...
Training deep neural network based Automatic Speech Recognition (ASR) models often requires thousand...
The recently proposed Conformer architecture has shown state-of-the-art perfor...
As a result of advancement in deep learning and neural network technology, end-to-end models have be...
With the skyrocketing popularity of mobile devices, new processing methods tailored to a specific appl...
Automatic speech recognition research focuses on training and evaluating on static datasets. Yet, as...
ASR error correction continues to serve as an important part of post-processing for speech recogniti...
Most contemporary ASR systems running on desktops use continuous-density HMMs (CHMM) with...
There is fast-growing research on designing energy-efficient computational devices and applications run...
Training an automatic speech recognition (ASR) post-processor based on sequence-to-sequence (S2S) re...
Over the past decades, the dominant approach towards building automatic speech recognition (ASR) sys...