Thesis (MEng)--Stellenbosch University, 2021. ENGLISH ABSTRACT: The automatic separation of raw audio into speech and non-speech is an important preprocessing step for many real-world speech processing systems. This task, known as voice activity detection (VAD), has largely been studied using constrained, synthetically corrupted speech data. In this thesis, we present a number of new VAD systems that are specifically designed for noisy in-the-wild audio. Previous research has shown the relationship between improved VAD and better downstream automatic speech recognition (ASR) performance. Our systems are to be used in a preprocessing step for low-resource ASR applied to real-world audio, and should therefore be computationally efficient, y...
Feed-forward multi-layer perceptrons (MLP) and recurrent neural networks (RNN) fed with different se...
Automatic speech recognition (ASR) does not perform equally well on every speaker. There is bias aga...
Thesis written and presented within the Erasmus+ programme at University of Crete. School of Sciences...
Recently, Deep Learning has revolutionized many fields, where one such area is Voice Activity Detect...
Voice activity detection (VAD) is a fundamental task in various speech-related applications, such as...
The aim of speaker recognition and verification is to identify people's identity from the characteris...
This thesis describes techniques for voice activity detection in audio recordings. It is necessary t...
This paper presents a robust voice activity detector (VAD) based on hidden Markov models (HMM) to im...
It is well known that additive noise can cause a significant decrease in performance for an automati...
Voice Activity Detection (VAD) aims to distinguish correctly those audio segments containing human spe...
515-522. This study evaluates the performance of objective measures in terms of predicting quality of nois...
Some practical uses of ASR have been implemented, including the transcription of meetings and the us...
Various ambient noises always corrupt the audio obtained in real-world environments, which partially...
Abstract. A robust and effective voice activity detection (VAD) algorithm is proposed for improving...
This work focuses on single-word speech recognition, where the end goal is to accurately recognize a...