Frame-online speech enhancement systems in the short-time Fourier transform (STFT) domain usually have an algorithmic latency equal to the window size due to the use of overlap-add in the inverse STFT (iSTFT). This algorithmic latency allows the enhancement models to leverage future contextual information up to a length equal to the window size. However, this information is only partially leveraged by current frame-online systems. To fully exploit it, we propose an overlapped-frame prediction technique for deep learning based frame-online speech enhancement, where at each frame our deep neural network (DNN) predicts the current and several past frames that are necessary for overlap-add, instead of only predicting the current frame. In addit...
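The overlap-add bookkeeping described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `emit_block`, `stream`, and `oracle_predict` are hypothetical names, the frames are time-domain segments standing in for the per-frame iSTFT output, and an oracle that returns the true windowed frames takes the place of the DNN. With window size `win_len` and hop `hop`, each hop-sized output block is covered by `K = win_len // hop` frames, so the predictor returns the current frame and the `K - 1` past frames at every step.

```python
import numpy as np


def emit_block(pred_frames, hop):
    """Overlap-add one hop-sized output block.

    pred_frames has shape (K, win_len); its rows are the estimates of
    frames t-K+1 .. t produced at step t (last row = current frame).
    """
    K = pred_frames.shape[0]
    block = np.zeros(hop)
    for k in range(K):
        # Frame t-k is row K-1-k; the output block lies k hops into it.
        block += pred_frames[K - 1 - k, k * hop:(k + 1) * hop]
    return block


def stream(predict, num_frames, hop):
    """Run frame by frame, emitting one block per step."""
    return np.concatenate(
        [emit_block(predict(t), hop) for t in range(num_frames)]
    )


# Toy setup (all values are illustrative assumptions).
win_len, hop = 16, 4
K = win_len // hop
num_frames = 10
rng = np.random.default_rng(0)
x = rng.standard_normal((num_frames - 1) * hop + win_len)

# Periodic Hann window, scaled so that K shifted copies sum to one.
n = np.arange(win_len)
window = (0.5 - 0.5 * np.cos(2 * np.pi * n / win_len)) / (K / 2)


def oracle_predict(t):
    """Stand-in for the DNN: returns the true windowed frames t-K+1 .. t."""
    frames = np.zeros((K, win_len))
    for row, j in enumerate(range(t - K + 1, t + 1)):
        if j >= 0:  # frames before the start of the signal stay zero
            frames[row] = window * x[j * hop:j * hop + win_len]
    return frames


y = stream(oracle_predict, num_frames, hop)
```

In this sketch, block `t` is emitted only after frame `t` is available, that is, after the sample at index `t * hop + win_len - 1` has been observed, which matches the algorithmic latency of one window stated above. Replacing `oracle_predict` with a model that re-estimates the past rows at every step is the part that the overlapped-frame prediction idea would change relative to predicting only the current frame.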
Advancements in machine learning techniques have promoted the use of deep neural networks (DNNs) for...
This thesis focuses on the development of neural network acoustic models for large vocabulary contin...
We present a neural vocoder designed with low-powered Alternative and Augmentative Communication dev...
Deep learning based speech enhancement in the short-time Fourier transform (STFT) domain typically u...
This paper describes a practical dual-process speech enhancement system that adapts environment-sens...
This paper proposes a delayed subband LSTM network for online monaural (single...
While phase-aware speech processing has been receiving increasing attention in recent years, most na...
The combination of a deep neural network (DNN)-based speech enhancement (SE) front-end and an autom...
3D speech enhancement can effectively improve the auditory experience and plays a crucial role in au...
This paper describes the practical response- and performance-aware development of online speech enha...
This work focuses on online dereverberation for hearing devices using the weighted prediction error ...
While deep neural networks have shown impressive results in automatic speaker recognition and relate...
This work proposes a new learning target based on reverberation time shortening (RTS) for speech der...
It is highly desirable that speech enhancement algorithms can achieve good performance while keeping...
Score-based generative models (SGMs) have recently shown impressive results for difficult generative...