Recent research on time-domain audio separation networks (TasNets) has brought great success to speech separation. Nevertheless, conventional TasNets struggle to satisfy the memory and latency constraints of industrial applications. In this regard, we design a low-cost, high-performance architecture, namely the globally attentive locally recurrent (GALR) network. Like the dual-path RNN (DPRNN), we first split a feature sequence into 2D segments and then process the sequence along both the intra- and inter-segment dimensions. Our main innovation is that, on top of features recurrently processed along the intra-segment dimension, GALR applies a self-attention mechanism to the sequence along the inter-segment dimension, which aggregates...
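The segment-then-process idea described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`segment`, `local_recurrence`, `global_attention`), the non-overlapping segmentation, the plain tanh recurrence, the single attention head without learned projections, and all sizes are assumptions chosen only to show the data flow that the name "locally recurrent, globally attentive" suggests, with recurrence inside each segment followed by attention across segments.

```python
import numpy as np

def segment(features, seg_len):
    """Split a (T, D) sequence into (S, K, D) segments, zero-padding the tail.

    Assumption: segments do not overlap; a real model may use overlapping ones.
    """
    T, D = features.shape
    n_seg = -(-T // seg_len)  # ceiling division
    padded = np.zeros((n_seg * seg_len, D), dtype=features.dtype)
    padded[:T] = features
    return padded.reshape(n_seg, seg_len, D)

def local_recurrence(segments, w_in, w_rec):
    """Run a plain tanh RNN along the intra-segment axis, one state per segment."""
    S, K, _ = segments.shape
    hidden = w_rec.shape[0]
    h = np.zeros((S, hidden))
    out = np.empty((S, K, hidden))
    for k in range(K):  # step through positions inside each segment
        h = np.tanh(segments[:, k] @ w_in + h @ w_rec)
        out[:, k] = h
    return out

def global_attention(x):
    """Single-head scaled dot-product self-attention along the inter-segment axis."""
    xt = x.transpose(1, 0, 2)  # (K, S, H): attend across the S segments
    scores = xt @ xt.transpose(0, 2, 1) / np.sqrt(x.shape[-1])  # (K, S, S)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over segments
    return (weights @ xt).transpose(1, 0, 2)  # back to (S, K, H)

# Hypothetical sizes: 10 frames, 4 features, segments of 4 frames, 8 hidden units.
rng = np.random.default_rng(0)
feats = rng.standard_normal((10, 4))
segs = segment(feats, seg_len=4)               # shape (3, 4, 4)
w_in = rng.standard_normal((4, 8)) * 0.1
w_rec = rng.standard_normal((8, 8)) * 0.1
local = local_recurrence(segs, w_in, w_rec)    # shape (3, 4, 8)
out = global_attention(local)                  # shape (3, 4, 8)
```

In this sketch the recurrence only ever runs over `seg_len` steps, while the attention step handles all segments at once with matrix products, which is one way to read the abstract's emphasis on memory and latency.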
In this paper, we compare different deep neural networks (DNN) in extracting speech signals from com...
Recurrent Neural Networks (RNN) provide a solution for low-cost Speech Recognition Systems (SRS) in ...
A novel extension to recurrent timing neural networks (RTNNs) is proposed which allows such networks...
One of the leading single-channel speech separation (SS) models is based on a TasNet with a dual-pat...
Speaker-independent speech separation has achieved remarkable performance in recent years with the d...
Recurrent neural networks (RNNs) have shown an ability to model temporal dependencies. However, the ...
Separation of speech mixtures in noisy and reverberant environments remains a challenging task for s...
3D speech enhancement can effectively improve the auditory experience and plays a crucial role in au...
This thesis takes the classical signal processing problem of separating the speech of a target speak...
Combining different models is a common strategy to build a good audio source separation system. In t...
This thesis focuses on the development of neural network acoustic models for large vocabulary contin...
Combining different models is a common strategy to build a good audio source separation system. In t...
In this thesis, a low-latency variant of the speaker-independent deep clustering method is proposed for...
This paper presents a new approach based on recurrent neural networks (RNN) to the multiclass audio ...
Despite the recent progress of automatic speech recognition (ASR) driven by deep learning, conversat...