We evaluate some recent developments in recurrent neural network (RNN) based speech enhancement in the light of noise-robust automatic speech recognition (ASR). The proposed framework is based on Long Short-Term Memory (LSTM) RNNs which are discriminatively trained according to an optimal speech reconstruction objective. We demonstrate that LSTM speech enhancement, even when used 'naively' as front-end processing, delivers competitive results on the CHiME-2 speech recognition task. Furthermore, simple, feature-level fusion based extensions to the framework are proposed to improve the integration with the ASR back-end. These yield a best result of 13.76% average word error rate, which is, to our knowledge, the best score to date
Speaker adaptation of deep neural networks (DNNs) based acoustic models is still a challenging area ...
Long short-term memory (LSTM) has been effectively used to represent sequential data in recent years...
Speaker adaptation of deep neural networks (DNNs) based acoustic models is still a challenging area ...
International audienceWe evaluate some recent developments in recurrent neural network (RNN) based s...
Long Short-Term Memory (LSTM) recurrent neural network has proven effective in mod-eling speech and ...
Long Short-Term Memory (LSTM) recurrent neural network has proven effective in modeling speech and ...
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11289).In this paper, we car...
Currently, the most popular speech recognition systems are based on unit selection - decision tree a...
Long short-term memory (LSTM) recurrent neural networks (RNNs) have been shown to give state-of-the-...
Long short-term memory (LSTM) recurrent neural networks (RNNs) have been shown to give state-of-the-...
In this paper, we present a new approach for fundamental frequency detection in noisy speech, based ...
In this paper, we show how new training principles and opti-mization techniques for neural networks ...
Long Short-Term Memory (LSTM) is a recurrent neural network (RNN) architecture that has been designe...
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9811).Automatic speech recog...
End-to-end speech recognition is the problem of mapping raw audio signal all the way to text. In doi...
Speaker adaptation of deep neural networks (DNNs) based acoustic models is still a challenging area ...
Long short-term memory (LSTM) has been effectively used to represent sequential data in recent years...
Speaker adaptation of deep neural networks (DNNs) based acoustic models is still a challenging area ...
International audienceWe evaluate some recent developments in recurrent neural network (RNN) based s...
Long Short-Term Memory (LSTM) recurrent neural network has proven effective in mod-eling speech and ...
Long Short-Term Memory (LSTM) recurrent neural network has proven effective in modeling speech and ...
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11289).In this paper, we car...
Currently, the most popular speech recognition systems are based on unit selection - decision tree a...
Long short-term memory (LSTM) recurrent neural networks (RNNs) have been shown to give state-of-the-...
Long short-term memory (LSTM) recurrent neural networks (RNNs) have been shown to give state-of-the-...
In this paper, we present a new approach for fundamental frequency detection in noisy speech, based ...
In this paper, we show how new training principles and opti-mization techniques for neural networks ...
Long Short-Term Memory (LSTM) is a recurrent neural network (RNN) architecture that has been designe...
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9811).Automatic speech recog...
End-to-end speech recognition is the problem of mapping raw audio signal all the way to text. In doi...
Speaker adaptation of deep neural networks (DNNs) based acoustic models is still a challenging area ...
Long short-term memory (LSTM) has been effectively used to represent sequential data in recent years...
Speaker adaptation of deep neural networks (DNNs) based acoustic models is still a challenging area ...