Although deep speaker embeddings have achieved promising performance in speaker verification, this advantage diminishes under speaking-style variability. Speaking-rate mismatch is common in practical speaker verification systems and can degrade system performance. To reduce the intra-class discrepancy caused by speaking rate, we propose a deep representation decomposition approach with adversarial learning to learn speaking-rate-invariant speaker embeddings. Specifically, using an attention block, we decompose the original embedding into an identity-related component and a rate-related component through multi-task training. Additionally, to reduce the latent relationship between the two decomposed compo...
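The decomposition step described in the abstract above can be illustrated with a minimal sketch. The abstract does not specify the form of the attention block, so this example assumes an element-wise sigmoid gate: the gate output weights the identity-related part and its complement weights the rate-related part. The names `decompose`, `weights`, and `bias` are hypothetical, the parameters are random rather than trained, and the multi-task and adversarial losses are omitted.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def decompose(embedding, weights, bias):
    """Split an embedding into two complementary parts with a soft mask.

    Assumption: the attention block is a single linear layer followed by
    a sigmoid, giving one gate value per embedding dimension.
    """
    mask = [
        sigmoid(sum(w * e for w, e in zip(row, embedding)) + b)
        for row, b in zip(weights, bias)
    ]
    # The gate selects the identity-related part; its complement
    # selects the rate-related part.
    identity_part = [m * e for m, e in zip(mask, embedding)]
    rate_part = [(1.0 - m) * e for m, e in zip(mask, embedding)]
    return identity_part, rate_part


random.seed(0)
dim = 4  # toy dimension; real speaker embeddings are much larger
embedding = [random.uniform(-1, 1) for _ in range(dim)]
weights = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]
bias = [0.0] * dim

identity_part, rate_part = decompose(embedding, weights, bias)
```

Under this gating assumption the two components always sum back to the original embedding, so the split only redistributes information. Which part ends up carrying identity and which carries rate would be decided by the training objectives, which this sketch does not model.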
This paper explores three novel approaches to improve the performance of speaker verification (SV) s...
This paper presents the SJTU system for both text-dependent and text-independent tasks in short-dura...
State-of-the-art speaker verification systems are inherently dependent on some kind of human supervi...
Speaker verification (SV) is a task to verify a claimed identity from the voice signal. A well-perfo...
In this work we improve the performance of a speaker verification system by matching the feature vec...
Advancements in automatic speaker verification (ASV) can be considered to be primarily limited to im...
Speaker embeddings represent a means to extract representative vectorial representations from a spee...
The objective of this work is to study state-of-the-art deep neural networks based speaker verificat...
Automatic Speaker Verification (ASV) is a critical task in pattern recognition and has been applied ...
Training robust speaker verification systems without speaker labels has long been a challenging task...
Model-based approaches to Speaker Verification (SV), such as Joint Factor Analysis (JFA), i-vector a...
Learning robust speaker embeddings is a crucial step in speaker diarization. Deep neural networks ca...
In this paper, we address the problem of speaker verification in conditions unseen or unknown during...