Learning robust speaker embeddings is a crucial step in speaker diarization. Deep neural networks can accurately capture speaker-discriminative characteristics, and popular deep embeddings such as x-vectors are now a fundamental component of modern diarization systems. Recently, several improvements over the standard TDNN architecture used for x-vectors have been proposed. The ECAPA-TDNN model, for instance, has shown impressive performance in speaker verification thanks to its carefully designed architecture. In this work, we extend, for the first time, the use of the ECAPA-TDNN model to speaker diarization. Moreover, we improve its robustness with a powerful augmentation scheme that concatenates several contaminated versions ...
This paper details our speaker diarization system designed for multi-domain, multi-microphone casual...
The performance of most speaker diarization systems with x-vector embeddings is both vulnerable to n...
| openaire: EC/H2020/780069/EU//MeMAD
In speaker-aware training, a speaker embedding is appended to D...
We investigate the use of deep neural networks (DNNs) for the speaker diarization task to improve pe...
The objective of this work is to study state-of-the-art deep neural network-based speaker verificat...
This paper describes the Intelligent Voice (IV) speaker diarization system for IberSPEECH-RTVE 20...
Speaker diarization finds contiguous speaker segments in an audio recording and clusters them by spe...
Current speaker verification techniques rely on a neural network to extract speaker representations....
Recently, a fully supervised speaker diarization approach was proposed (UIS-RNN) which models speake...
The aim of this work is to gain insights into how the deep neural network (DNN) models should be tra...
The recent speaker embeddings framework has been shown to provide excellent performance on the task ...
In this paper we investigate the use of deep neural networks (DNNs) for a small footprint text-depen...
Speaker embeddings represent a means to extract representative vectorial representations from a spee...