Most state-of-the-art deep learning systems for text-independent speaker verification are based on speaker embedding extractors. These architectures are commonly composed of a feature extractor front-end together with a pooling layer that encodes variable-length utterances into fixed-length speaker vectors. In this paper, we present Double Multi-Head Attention (MHA) pooling, which extends our previous approach based on Self MHA. An additional self-attention layer is added to the pooling layer to summarize the context vectors produced by MHA into a unique speaker representation. This method enhances the pooling mechanism by weighting the information captured by each head, and it yields more discriminative speaker embeddin...
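The pooling idea described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `double_mha_pooling`, the parameters `head_queries` and `summary_query`, and the scaled dot-product scoring are assumptions chosen for illustration. The abstract states only that a second attention layer weights the per-head context vectors and merges them into one speaker vector.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def double_mha_pooling(frames, head_queries, summary_query):
    """Pool a (T, D) frame sequence into one fixed-length vector.

    frames:        (T, D) frame-level features from the front-end
    head_queries:  (K, D // K) one query vector per head (hypothetical name)
    summary_query: (D // K,) query of the second attention (hypothetical name)
    """
    num_frames, feat_dim = frames.shape
    num_heads, head_dim = head_queries.shape
    assert num_heads * head_dim == feat_dim, "D must equal K * head_dim"

    # Split every frame into K sub-vectors, one per head.
    heads = frames.reshape(num_frames, num_heads, head_dim)

    # First attention: each head scores the frames and builds a context vector.
    # Scaled dot-product scoring is an assumption made for this sketch.
    scores = np.einsum("tkd,kd->tk", heads, head_queries) / np.sqrt(head_dim)
    frame_weights = softmax(scores, axis=0)            # (T, K), sums to 1 over T
    contexts = np.einsum("tk,tkd->kd", frame_weights, heads)  # (K, head_dim)

    # Second attention: weight the K context vectors and sum them.
    head_scores = contexts @ summary_query / np.sqrt(head_dim)
    head_weights = softmax(head_scores, axis=0)        # (K,), sums to 1
    embedding = head_weights @ contexts                # (head_dim,)
    return embedding, frame_weights, head_weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    queries = rng.standard_normal((4, 8))
    summary = rng.standard_normal(8)
    for length in (50, 120):  # utterances of different lengths
        utterance = rng.standard_normal((length, 32))
        emb, _, head_w = double_mha_pooling(utterance, queries, summary)
        print(length, emb.shape, head_w.round(3))
```

In this sketch the output dimension is `D // K` regardless of the number of frames, which is what lets the pooling layer map variable-length utterances to fixed-length vectors. In a trained system the two sets of query vectors would be learned jointly with the front-end rather than drawn at random.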
The objective of this paper is speaker recognition under noisy and unconstrained conditions. We mak...
In recent years, the self-supervised learning paradigm has received extensive attention due to its great...
openaire: EC/H2020/780069/EU//MeMAD. In speaker-aware training, a speaker embedding is appended to D...
Most state-of-the-art Deep Learning (DL) approaches for speaker recognition work on a short utterance...
This paper explores three novel approaches to improve the performance of speaker verification (SV) s...
In the recent past, deep neural networks became the most successful approach to extract the speaker ...
Convolutional neural networks (CNNs) have significantly promoted the development of speaker verifica...
Personalized voice triggering is a key technology in voice assistants and serves as the fir...
This paper presents an improved deep embedding learning method based on convolutional neural network...
In this paper, a hierarchical attention network is proposed to generate robust utterance-level embed...
The lack of labeled background data creates a large performance gap between cosine and Probabilistic Lin...
The objective of this work is to study state-of-the-art deep neural networks based speaker verificat...
Current speaker verification techniques rely on a neural network to extract speaker representations....
Automatic speaker recognition systems are becoming more and more accurate, with...