This work is concerned with using deep neural networks to estimate binary masks within a speech enhancement framework. We first examine the effect of supplementing the audio features used in mask estimation with visual speech information. Visual speech is known to be robust to noise, although not necessarily as discriminative as audio features, particularly at higher signal-to-noise ratios. Furthermore, most DNN approaches to mask estimation use the cross-entropy (CE) loss function, which aims to maximise classification accuracy. However, we propose a loss function that aims to maximise the hit minus false-alarm (HIT-FA) rate of the mask, which is known to correlate more closely with speech intelligibility than classification accuracy. W...
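To make the HIT-FA criterion concrete, the following is a minimal sketch, not the authors' formulation. It assumes the usual definitions: HIT is the fraction of target-dominant time-frequency units (ideal mask value 1) that the estimated mask also labels 1, and FA is the fraction of noise-dominant units (ideal mask value 0) that the estimated mask wrongly labels 1. The function names `hit_fa_rate` and `soft_hit_fa_loss`, and the soft relaxation that replaces hard decisions with network output probabilities, are illustrative assumptions.

```python
import numpy as np


def hit_fa_rate(estimated_mask, ideal_mask):
    """HIT-FA rate of a binary mask against the ideal binary mask.

    Both inputs are arrays of 0/1 values with the same shape,
    e.g. (frames, frequency_channels).
    """
    est = np.asarray(estimated_mask, dtype=bool)
    ref = np.asarray(ideal_mask, dtype=bool)
    n_target = ref.sum()      # target-dominant units
    n_noise = (~ref).sum()    # noise-dominant units
    # Guard against masks that contain only one class.
    hit = (est & ref).sum() / n_target if n_target else 0.0
    fa = (est & ~ref).sum() / n_noise if n_noise else 0.0
    return float(hit - fa)


def soft_hit_fa_loss(probabilities, ideal_mask, eps=1e-8):
    """Hypothetical differentiable surrogate: negative soft HIT-FA.

    `probabilities` are per-unit outputs in [0, 1] (for example sigmoid
    activations). Minimising this value maximises the soft HIT-FA rate.
    """
    p = np.asarray(probabilities, dtype=float)
    y = np.asarray(ideal_mask, dtype=float)
    soft_hit = (p * y).sum() / (y.sum() + eps)
    soft_fa = (p * (1.0 - y)).sum() / ((1.0 - y).sum() + eps)
    return float(-(soft_hit - soft_fa))


if __name__ == "__main__":
    ideal = np.array([[1, 1, 0, 0],
                      [1, 0, 0, 0]])
    estimate = np.array([[1, 0, 0, 1],
                         [1, 0, 0, 0]])
    print(hit_fa_rate(estimate, ideal))
    print(soft_hit_fa_loss(estimate, ideal))
```

Unlike classification accuracy, this score normalises hits and false alarms by their own class sizes, so a mask that labels every unit as noise scores 0 rather than benefiting from class imbalance.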
The performance of the existing speech enhancement algorithms is not ideal in low signal-to...
Computational speech segregation attempts to automatically separate speech from noise. This is chall...
Deep neural networks (DNNs) have recently been shown to give state-of-the-art performance in monaural...
This work proposes and compares perceptually motivated loss functions for deep learning based binary...
Human auditory cortex excels at selectively suppressing background noise to fo...
Estimating time-frequency domain masks for single-channel speech enhancement using deep learning met...
The aim of the work in this thesis is to explore how visual speech can be used within monaural maski...
This work examines whether visual speech information can be effective within audio masking-based s...
Originally, ideal binary mask (idbm) techniques have been used as a tool for studying aspects of the...
Deep neural networks (DNNs) are usually used for single channel source separation to predict either ...
It is known that applying a time-frequency binary mask to very noisy speech can improve its intellig...
Forward masking models have been used successfully in speech enhancement and audio coding. Presently...
Additive noise has long been an issue for robust automatic speech recognition (ASR) systems. One app...
Forward masking models have been used successfully in speech enhancement and audio coding. Currently...
In real rooms, recorded speech usually contains reverberation, which degrades the quality and intell...