Learning an effective speaker representation is crucial for achieving reliable performance in speaker verification tasks. Speech signals are high-dimensional, long, and variable-length sequences that entail a complex hierarchical structure. Signals may contain diverse information at each time-frequency (TF) location. For example, it may be more beneficial to focus on high-energy parts for phoneme classes such as fricatives. The standard convolutional layer that operates on neighboring local regions cannot capture the complex TF global context information. In this study, a general global time-frequency context modeling framework is proposed to leverage the context information specifically for speaker representation modeling. First, a data-dr...
Speaker recognition is becoming an increasingly popular technology in today’s society. Besides being...
Current speaker verification techniques rely on a neural network to extract speaker representations....
Speech is at the core of human communication. Speaking and listing comes so natural to us that we do...
This paper explores three novel approaches to improve the performance of speaker verification (SV) s...
The objective of this work is to develop a speaker recognition model to be used in diverse scenarios...
Convolutional neural networks (CNNs) have significantly promoted the development of speaker verifica...
Speech technology has developed to levels equivalent with human parity through the use of deep neura...
In this paper, a novel architecture for speaker recognition is proposed by cascading speech enhancem...
In the recent past, Deep neural networks became the most successful approach to extract the speaker ...
This paper explores three novel approaches to improve the performance of speaker verification (SV) s...
Most state-of-the-art Deep Learning (DL) approaches forspeaker recognition work on a short utterance...
To extract accurate speaker information for text-independent speaker verification, temporal dynamic ...
In this paper, a hierarchical attention network is proposed to generate utterance-level embeddings (...
Effective speaker identification is essential for achieving robust speaker recognition in real-world...
Deep learning and neural network research has grown significantly in the fields of automatic speech ...
Speaker recognition is becoming an increasingly popular technology in today’s society. Besides being...
Current speaker verification techniques rely on a neural network to extract speaker representations....
Speech is at the core of human communication. Speaking and listing comes so natural to us that we do...
This paper explores three novel approaches to improve the performance of speaker verification (SV) s...
The objective of this work is to develop a speaker recognition model to be used in diverse scenarios...
Convolutional neural networks (CNNs) have significantly promoted the development of speaker verifica...
Speech technology has developed to levels equivalent with human parity through the use of deep neura...
In this paper, a novel architecture for speaker recognition is proposed by cascading speech enhancem...
In the recent past, Deep neural networks became the most successful approach to extract the speaker ...
This paper explores three novel approaches to improve the performance of speaker verification (SV) s...
Most state-of-the-art Deep Learning (DL) approaches forspeaker recognition work on a short utterance...
To extract accurate speaker information for text-independent speaker verification, temporal dynamic ...
In this paper, a hierarchical attention network is proposed to generate utterance-level embeddings (...
Effective speaker identification is essential for achieving robust speaker recognition in real-world...
Deep learning and neural network research has grown significantly in the fields of automatic speech ...
Speaker recognition is becoming an increasingly popular technology in today’s society. Besides being...
Current speaker verification techniques rely on a neural network to extract speaker representations....
Speech is at the core of human communication. Speaking and listing comes so natural to us that we do...