Several attempts to enhance statistical parametric speech synthesis have contemplated deep-learning-based postfilters, which learn to map the synthetic speech parameters to the natural ones, reducing the gap between them. In this paper, we introduce a new pre-training approach for neural networks, applied in LSTM-based postfilters for speech synthesis, with the objective of enhancing the quality of the synthesized speech in a more efficient manner. Our approach begins with an auto-regressive training of one LSTM network, which is then used as an initialization for postfilters based on a denoising autoencoder architecture. We show the advantages of this initialization on a set of multi-stream postfilters, which encompass a collec...
Over the past several decades, numerous speech enhancement techniques have been proposed to improve ...
During the 2000s decade, unit-selection based text-to-speech was the dominant commercial technology....
Nearly all Statistical Parametric Speech Synthesizers today use Mel Cepstral coefficients as the voc...
Several researchers have contemplated deep learning-based post-filters to increase the quality of st...
Several researchers have contemplated deep learning-based post-filters to increase the quality of st...
Recent developments in speech synthesis have produced systems capable of producing speech which clos...
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9703). Recent developments in...
Statistical parametric speech synthesis based on Hidden Markov Models has been an important techniq...
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12088). The quality of speech...
In this paper, we present a new approach for fundamental frequency detection in noisy speech, based ...
In this chapter, we introduce hybrid postfilters into speech synthesis, with the objective of enhan...
Currently, the most popular speech recognition systems are based on unit selection - decision tree a...
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11289). In this paper, we car...
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9811). Automatic speech recog...
In this paper we propose a deep neural network to model the conditional probability of the spectral ...