The objective of this work is to train noise-robust speaker embeddings adapted for speaker diarisation. Speaker embeddings play a crucial role in the performance of diarisation systems, but they often capture spurious information such as noise, which adversely affects performance. Our previous work proposed an auto-encoder-based dimensionality reduction module to help remove the redundant information. However, that module does not explicitly separate such information and has also been found to be sensitive to hyper-parameter values. We propose two contributions to overcome these issues: (i) a novel dimensionality reduction framework that can disentangle spurious information from the speaker embeddings; (ii) the use of speech activity v...
State-of-the-art speaker verification systems are inherently dependent on some kind of human supervi...
Speaker embeddings represent a means to extract representative vectorial representations from a spee...
This paper investigates robust privacy-sensitive audio features for speaker diarization in multipart...
Speech 'in-the-wild' is a handicap for speaker recognition systems due to the variability induced by...
Speaker embedding extractors significantly influence the performance of clustering-based speaker dia...
Over the last few years, deep learning has grown in popularity for speaker verification, identificat...
This thesis describes research into speaker diarization for recorded meetings. It explores the algo...
This paper details our speaker diarization system designed for multi-domain, multi-microphone casual...
Speaker diarisation addresses the question of 'who speaks when' in audio recordings, and has been st...
Two new features have been proposed and used in the Rich Transcription Evaluation 2009 by the Univer...