Data augmentation is vital to the generalization ability and robustness of deep neural network (DNN) models. Existing augmentation methods for speaker verification manipulate the raw signal, which is time-consuming, and the resulting augmented samples lack diversity. In this paper, we present a novel difficulty-aware semantic augmentation (DASA) approach for speaker verification, which can generate diversified training samples in speaker embedding space with negligible extra computing cost. Firstly, we augment training samples by perturbing speaker embeddings along semantic directions, which are obtained from speaker-wise covariance matrices. Secondly, accurate covariance matrices are estimated from robust speaker embeddings during training, so we ...
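The first step described above, perturbing speaker embeddings along directions given by speaker-wise covariance matrices, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `augment_embeddings`, the `strength` scaling factor, and the choice to draw explicit Gaussian noise are all assumptions made for illustration, and the difficulty-aware part of the method is not covered because the excerpt does not describe it.

```python
import numpy as np


def augment_embeddings(embeddings, labels, strength=0.5, rng=None):
    """Add zero-mean Gaussian noise shaped by each speaker's covariance.

    Hypothetical helper: `strength` scales the covariance and is an
    assumed hyperparameter, not a value taken from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    embeddings = np.asarray(embeddings, dtype=np.float64)
    labels = np.asarray(labels)
    augmented = embeddings.copy()
    dim = embeddings.shape[1]

    for speaker in np.unique(labels):
        idx = np.flatnonzero(labels == speaker)
        if idx.size < 2:
            # A covariance cannot be estimated from one sample; leave it as is.
            continue
        # Speaker-wise covariance matrix, shape (dim, dim).
        cov = np.atleast_2d(np.cov(embeddings[idx], rowvar=False))
        # One perturbation per sample, drawn along this speaker's directions.
        noise = rng.multivariate_normal(
            np.zeros(dim), strength * cov, size=idx.size
        )
        augmented[idx] += noise
    return augmented


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    emb = rng.normal(size=(6, 4))          # six toy embeddings of dimension 4
    spk = np.array([0, 0, 0, 1, 1, 1])     # two speakers, three samples each
    print(augment_embeddings(emb, spk, rng=rng).shape)
```

Because the perturbation is applied to embeddings rather than to waveforms, no extra forward pass over augmented audio is needed, which is consistent with the low extra cost the abstract claims.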
Despite achieving satisfactory performance in speaker verification using deep neural networks, varia...
The objective of this work is to train noise-robust speaker embeddings adapted for speaker diarisati...
Speaker verification (SV) provides billions of voice-enabled devices with access control, and ensure...
State-of-the-art speaker verification systems are inherently dependent on some kind of human supervi...
Advancements in automatic speaker verification (ASV) can be considered to be primarily limited to im...
Effective speaker identification is essential for achieving robust speaker recognition in real-world...
DNN-based speaker verification (SV) models demonstrate significant performance at relatively high co...
This paper presents an improved deep embedding learning method based on convolutional neural network...
This paper presents the SJTU system for both text-dependent and text-independent tasks in short-dura...
Modern speaker verification models use deep neural networks to encode utterance audio into discrimin...
Phonetic information is one of the most essential components of a speech signal, playing an importan...
While promising performance for speaker verification has been achieved by deep speaker embeddings, t...
Learning robust speaker embeddings is a crucial step in speaker diarization. Deep neural networks ca...
This paper explores three novel approaches to improve the performance of speaker verification (SV) s...
Modern automatic speaker verification relies largely on deep neural networks (...