Keyword spotting (KWS) is the task of identifying a set of predefined words in audio streams. With recent advances in deep neural networks, it has become a popular technology for activating and controlling small devices, such as voice assistants. Deploying such models on edge devices, however, can be challenging due to hardware constraints. Moreover, as adversarial attacks against voice-based technologies have increased, developing solutions robust to such attacks has become crucial. In this work, we propose VIC-KD, a robust distillation recipe for model compression and adversarial robustness. Using self-supervised speech representations, we show that imposing geometric priors on the latent representations of both Teacher and...
State-of-the-art speaker verification systems are inherently dependent on some kind of human supervi...
In this paper, we introduce a novel neural network training framework that increases a model's adversa...
This paper explores three novel approaches to improve the performance of speaker verification (SV) s...
The countermeasure (CM) model is developed to protect ASV systems from spoof attacks and prevent res...
In recent years, the development of accurate deep keyword spotting (KWS) models has resulted in KWS ...
Large-scale self-supervised pre-trained speech encoders outperform conventional approaches in speech...
Despite the remarkable performance and generalization levels of deep learning models in a wide range...
Knowledge distillation (KD) is used to enhance automatic speaker verification performance by ensurin...
Knowledge distillation is effective for producing small, high-performance neural networks for classi...
Recent years have witnessed great strides in self-supervised learning (SSL) on the speech processing...
Although the security of automatic speaker verification (ASV) is seriously threatened by recently em...
Automatic speaker verification (ASV), one of the most important technologies for biometric identificat...
Voice-user interface (VUI) has exploded in popularity due to the recent advances in automatic speech...
An adversarial attack is a method to generate perturbations to the input of a machine learning model...
This paper introduces Robust Spin (R-Spin), a data-efficient self-supervised fine-tuning framework f...