Speech recorded in real rooms usually contains reverberation, which degrades its quality and intelligibility. Neural networks trained with a mean squared error (MSE) loss to estimate complex ideal ratio masks (cIRMs) have proven effective for speech dereverberation. However, when MSE loss is used to estimate complex-valued masks, phase may in some cases have a disproportionate effect compared to magnitude. We propose a new weighted magnitude-phase loss function, split into a magnitude component and a phase component, to train a neural network to estimate cIRMs. A weight parameter adjusts the relative contribution of magnitude and phase to the overall loss. We find that our proposed...
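The idea of a loss split into a weighted magnitude term and a weighted phase term can be sketched as follows. This is only an illustrative sketch, not the authors' formulation: the abstract does not give the exact form of either component, so the squared magnitude error, the cosine-based phase error, the convex weighting with `alpha`, and all function names are assumptions made here. The mask itself uses the usual definition of the cIRM as the complex ratio of the clean spectrum to the reverberant spectrum.

```python
import numpy as np


def complex_ideal_ratio_mask(clean_stft, reverb_stft, eps=1e-8):
    """cIRM per time-frequency bin: the complex ratio S / Y.

    Written as S * conj(Y) / (|Y|^2 + eps) so that bins with a
    near-zero reverberant spectrum do not divide by zero.
    """
    denom = np.abs(reverb_stft) ** 2 + eps
    return clean_stft * np.conj(reverb_stft) / denom


def weighted_magnitude_phase_loss(est_mask, ref_mask, alpha=0.5):
    """Hypothetical weighted magnitude-phase loss (assumed form).

    alpha in [0, 1] sets the relative contribution of the two terms:
    alpha = 1 uses magnitude only, alpha = 0 uses phase only.
    """
    # Magnitude component: squared error between mask magnitudes.
    mag_err = np.mean((np.abs(est_mask) - np.abs(ref_mask)) ** 2)

    # Phase component: 1 - cos(phase difference), which is 0 when the
    # phases agree and is insensitive to 2*pi wrap-around (assumption).
    phase_diff = np.angle(est_mask) - np.angle(ref_mask)
    phase_err = np.mean(1.0 - np.cos(phase_diff))

    return alpha * mag_err + (1.0 - alpha) * phase_err


# Toy usage: a pure phase rotation of the reference mask.
ref = np.array([1 + 0j, 0 + 2j])
est = ref * np.exp(1j * 0.5)
print(weighted_magnitude_phase_loss(est, ref, alpha=0.5))
```

In this sketch a pure phase rotation of the estimate leaves the magnitude term at zero, so the weight `alpha` alone decides how strongly that phase error is penalized. With a plain MSE on the real and imaginary parts, magnitude and phase errors cannot be traded off in this way.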
Complex-valued neural networks (CVNNs) were first developed some time ago, but there has recently be...
The majority of speech processing algorithms operate only with the spectral magnitude, leaving spectral ...
A throat microphone is robust to surrounding noise and can even pick up whispers; however, speech ...
Estimating time-frequency domain masks for single-channel speech enhancement using deep learning met...
In the past years, the usage of neural networks in speech processing has increased significantly. Th...
Most of the speech enhancement algorithms rely on estimating the magnitude spectrum of the clean spe...
This work is concerned with using deep neural networks for estimating binary masks within a speech e...
In this work we present a new single-microphone speech dereverberation algorithm. First, a performan...
Mapping and masking are two important speech enhancement methods based on deep learning that aim to ...
This paper investigates four single-channel speech dereverberation algorithms, i.e., two unsupervise...
The time-frequency mask and the magnitude spectrum are two common targets for deep learning-based sp...
This thesis explored separating impulse noise from a desired signal, for the purposes of hearing pro...
Time-frequency (T-F) domain masking is a mainstream approach for single-channel speech enhancement. ...
This paper investigates deep neural networks (DNN) based on nonlinear feature mapping and statistica...
State-of-the-art methods for monaural singing voice separation consist in estimating the magnitude s...