Speech distortions are a long-standing problem that degrades the performance of speech processing models trained with supervision. Such models need to be made robust enough to perform well on distorted speech without sacrificing their original performance on clean speech. In this work, we propose to improve the robustness of speech processing models through domain adversarial training (DAT). We conducted experiments on five different speech processing tasks using the SUPERB framework. Because the distortion types of speech data are not always known, we analyzed the binary-domain and multi-domain settings, where the former treats all distorted speech as one domain, and the latt...
We present a self-supervised speech restoration method without paired speech corpora. Because the pr...
The performance of automatic speech recognition (ASR) systems degrades drastically under noisy condi...
While Automatic Speech Recognition (ASR) models have shown significant advances with the introductio...
We present RemixIT, a simple yet effective self-supervised method for training speech enhancement wi...
In real-world applications, speaker recognition models often face various domain-mismatch challenges...
We propose RemixIT, a simple and novel self-supervised training method for speech enhancement. The p...
Automatic speech recognition models are often adapted to improve their accuracy in a new domain. A p...
The cross-domain performance of automatic speech recognition (ASR) could be severely hampered due to...
Recent advances with self-supervised learning have allowed speech recognition systems to achieve sta...
The performance of deep learning approaches to speech enhancement degrades significantly in the face of ...
Self-supervised representation learning (SSRL) has improved the performance on downstream phoneme re...
In this paper, we explore an improved framework to train a monaural neural enhancement model for ro...
Automatic speech recognition (ASR) has shown rapid advances in recent years but still degrades signi...
Advances in self-supervised learning have significantly reduced the amount of transcribed audio requ...
Speech enhancement aims to suppress background noise in noisy speech signals in order to improve spe...