Recently, generative neural network models that operate directly on raw audio, such as WaveNet, have improved the state of the art in text-to-speech synthesis (TTS). Moreover, there is increasing interest in using these models as statistical vocoders for generating speech waveforms from various acoustic features. However, there is also a need to reduce model complexity without compromising synthesis quality. Previously, glottal pulseforms (i.e., time-domain waveforms corresponding to the source of the human voice production mechanism) have been successfully synthesized in TTS by glottal vocoders using straightforward deep feedforward neural networks. Therefore, it is natural to extend the glottal waveform modeling domain to use the mo...
Parametric speech synthesis has received increased attention in recent years following the developme...
Paper and poster presented at Interspeech 2017, held 20 to 24 August in Stockholm, Sweden...
Estimation of glottal source information can be performed non-invasively from speech by using glotta...
Recent studies have shown that text-to-speech synthesis quality can be improved by using glottal voc...
Recent speech technology research has seen a growing interest in using WaveNets as statistical vocod...
Neural network-based models that generate glottal excitation waveforms from acoustic features have b...
The state-of-the-art in text-to-speech (TTS) synthesis has recently improved considerably due to nov...
Glottal volume velocity waveform, the acoustical excitation of voiced speech, cannot be acquired thr...
This paper proposes the use of the Liljencrants-Fant model (LF-model) to represent the glottal source...
This paper proposes the use of the Liljencrants-Fant model (LF-model) to represent the glottal sourc...
A vocoder is used to express a speech waveform with a controllable parametric representation that ca...
We recently presented a new model for singing synthesis based on a modified version of the WaveNet a...
We have studied the analysis and synthesis of speech with a glottal-excited speech synthesizer. Inst...
We recently presented a new model for singing synthesis based on a modified version of the WaveNet a...
Source-filter paradigm is one of the most common approaches used by the scient...