In this paper, we analyze the use of speech separation guided diarization (SSGD) in telephone conversations. SSGD performs diarization by separating the speakers' signals and then applying voice activity detection to each estimated speaker signal. In particular, we compare two low-latency speech separation models. Moreover, we present a post-processing algorithm that significantly reduces the false alarm errors of an SSGD pipeline. We perform our experiments on two datasets, Fisher Corpus Part 1 and CALLHOME, evaluating both separation and diarization metrics. Notably, our SSGD DPRNN-based online model achieves 11.1% DER on CALLHOME, comparable with most state-of-the-art end-to-end neural diarization models despite being trained...
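The second stage of the SSGD idea described above (voice activity detection applied to each estimated speaker signal) can be illustrated with a minimal sketch. It makes several assumptions that are not taken from the paper: the separated signals are given as plain lists of samples, the separation model itself is out of scope, and a simple energy threshold stands in for whatever VAD the authors actually use. The names `energy_vad` and `ssgd_diarize`, and the parameters `frame` and `threshold`, are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Segment = Tuple[float, float]  # (start_sec, end_sec)


def energy_vad(signal: Sequence[float], sample_rate: int,
               frame: int = 160, threshold: float = 1e-3) -> List[Segment]:
    """Hypothetical stand-in VAD: a frame is 'speech' if its mean power
    exceeds `threshold`. `frame` is the frame length in samples."""
    segments: List[Segment] = []
    start = None  # start time of the currently open speech segment, if any
    n_frames = len(signal) // frame  # trailing partial frame is ignored
    for i in range(n_frames):
        chunk = signal[i * frame:(i + 1) * frame]
        power = sum(x * x for x in chunk) / frame
        t = i * frame / sample_rate
        if power > threshold and start is None:
            start = t                      # speech onset
        elif power <= threshold and start is not None:
            segments.append((start, t))    # speech offset
            start = None
    if start is not None:                  # signal ended while speech was active
        segments.append((start, n_frames * frame / sample_rate))
    return segments


def ssgd_diarize(separated: Sequence[Sequence[float]], sample_rate: int,
                 frame: int = 160, threshold: float = 1e-3) -> Dict[int, List[Segment]]:
    """Run the VAD independently on each estimated speaker signal and
    return the speech segments per speaker index."""
    return {spk: energy_vad(sig, sample_rate, frame, threshold)
            for spk, sig in enumerate(separated)}


# Toy example: two already-separated 2-second signals at 10 Hz.
spk0 = [1.0] * 10 + [0.0] * 10   # speaks during the first second
spk1 = [0.0] * 10 + [1.0] * 10   # speaks during the second second
print(ssgd_diarize([spk0, spk1], sample_rate=10, frame=5))
```

Because each output channel is treated independently, any residual energy that the separator leaks into the wrong channel will be marked as speech for that speaker. This is one plausible source of the false alarm errors that the post-processing step mentioned in the abstract is meant to reduce, although the sketch does not attempt to reproduce that algorithm.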
In this paper we propose a novel scheme for carrying out speaker diarization in an iterative manner....
In this paper, we focus on the issue of speaker diarization of data from a real call center. We have...
Speaker diarization for recordings made in meetings consists of identifying the number of participan...
We performed an experimental review of current diarization systems for the conversational telephone ...
This paper introduces a new task termed low-latency speaker spotting (LLSS). Related to security and...
This paper investigates the application of the probabilistic linear discriminant analysis (PLDA) to ...
This paper presents an approach to the speaker diarization problem based on speech local waveform an...
The goal in Speaker Diarization (SD) is to answer the question "Who spoke when?" for a given audio w...
This paper presents an approach to the speaker diarization problem based on a step-wise form of spee...
This paper investigates robust privacy-sensitive audio features for speaker diarization in...
Speaker diarization is the task of determining "who speaks when" in an audio stream that usually con...
The linguistic content of a speech signal is a source of unwanted variation which can degrade speake...
When dealing with overlapped speech, the performance of automatic speech recognition (ASR) systems s...
Human-Machine interaction in meetings requires the localization and identification of the...