In this paper, we present a new approach to fundamental frequency detection in noisy speech, based on Long Short-Term Memory (LSTM) neural networks. Fundamental frequency is one of the most important parameters of human speech. Its detection is relevant in many areas of speech signal processing and remains a challenge for severely degraded signals. In previous work on speech enhancement and noise reduction, LSTM networks have been initialized with random weights, which are then adjusted by back-propagation through time. We propose a more efficient initialization, based on supervised training of an auto-associative network. This initialization is a better starting point for the fundamental f...
End-to-end speech recognition is the problem of mapping a raw audio signal all the way to text. In doi...
Acoustic novelty detection aims at identifying abnormal/novel acoustic signals which differ from the...
Speech intelligibility can be degraded due to multiple factors, such as noisy environments, technica...
Part of the Communications in Computer and Information Science book series (CCIS, volume 1087). The d...
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11289). In this paper, we car...
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9811). Automatic speech recog...
Several researchers have contemplated deep learning-based post-filters to increase the quality of st...
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12088). The quality of speech...
Several attempts to enhance statistical parametric speech synthesis have contemplated deep-learning-...
We evaluate some recent developments in recurrent neural network (RNN) based speech enhancement in t...
The thesis focuses on the use of a deep recurrent neural network architecture, Long Short-Term Memory ...
We propose a method using a long short-term memory (LSTM) network to estimate ...
In this paper a novel speaker recognition system is introduced. Automated speaker recognition has be...
We apply Long Short-Term Memory (LSTM) recurrent neural networks to a large corpus of unpr...
We propose a multichannel speech enhancement method using a long short-term mem...