Most HMM-based TTS systems use a hard voiced/unvoiced classification to produce a discontinuous F0 signal, which is then used to generate the source excitation. When a mixed source excitation is used, this decision can be based on two different sources of information: the state-specific MSD prior of the F0 models, and/or the frame-specific features generated by the aperiodicity model. This paper examines the meaning of these variables in the synthesis process, their interaction, and how they affect the perceived quality of the generated speech. The results of several perceptual experiments show that when using mixed excitation, subjects consistently prefer samples with very few or no false unvoiced errors, whereas a reduction in the rate...
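To make the two information sources concrete, the following is a minimal sketch of a per-frame hard voiced/unvoiced decision. It is not the procedure used in the paper: the function names, the thresholds (`msd_threshold`, `ap_threshold`), the `mode` switch, and the assumption that aperiodicity is a single value per frame in [0, 1] (1 meaning fully noise-like) are all hypothetical choices made for illustration. The state-specific MSD voicing weight is assumed to have already been expanded to one value per frame.

```python
from typing import List, Sequence


def voicing_decision(
    msd_voiced_weight: Sequence[float],
    aperiodicity: Sequence[float],
    msd_threshold: float = 0.5,   # hypothetical threshold on the MSD voiced weight
    ap_threshold: float = 0.9,    # hypothetical threshold on aperiodicity
    mode: str = "both",
) -> List[bool]:
    """Return one hard label per frame: True = voiced, False = unvoiced.

    mode="msd"          uses only the state-level MSD voiced weight,
    mode="aperiodicity" uses only the frame-level aperiodicity value,
    mode="both"         requires both sources to agree on "voiced".
    """
    if len(msd_voiced_weight) != len(aperiodicity):
        raise ValueError("inputs must contain one value per frame")

    labels = []
    for weight, ap in zip(msd_voiced_weight, aperiodicity):
        voiced_by_msd = weight >= msd_threshold
        voiced_by_ap = ap < ap_threshold  # low aperiodicity = mostly periodic
        if mode == "msd":
            labels.append(voiced_by_msd)
        elif mode == "aperiodicity":
            labels.append(voiced_by_ap)
        elif mode == "both":
            labels.append(voiced_by_msd and voiced_by_ap)
        else:
            raise ValueError(f"unknown mode: {mode!r}")
    return labels


def apply_voicing(
    f0: Sequence[float], labels: Sequence[bool], unvoiced_value: float = 0.0
) -> List[float]:
    """Build a discontinuous F0 track by marking unvoiced frames with a sentinel."""
    return [value if voiced else unvoiced_value for value, voiced in zip(f0, labels)]


if __name__ == "__main__":
    msd = [0.9, 0.9, 0.2, 0.6]       # illustrative values only
    ap = [0.1, 0.95, 0.1, 0.5]
    f0 = [120.0, 125.0, 130.0, 118.0]
    labels = voicing_decision(msd, ap, mode="both")
    print(labels)
    print(apply_voicing(f0, labels))
```

Switching `mode` shows how the two sources can disagree on the same frame, which is the kind of interaction the abstract refers to. Raising either threshold trades false unvoiced errors against false voiced errors.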
Although Hidden Markov Model based speech synthesis has been proved to have good performance, there ...
This article focuses on developing a system for high-quality synthesized and converted speech by add...
This paper introduces a novel method to improve the U/V decision method in HMM-based speech synthesi...
Fundamental frequency, or F0, is critical for high-quality speech synthesis in HMM-based speech synth...
This paper describes a trainable excitation approach to eliminate the unnaturalness of HMM-based spe...
HMM-based speech synthesis offers a way to generate speech with different voice qualities. However, ...
ICASSP2009: IEEE International Conference on Acoustics, Speech, and Signal Processing, April 19-24...
HMM-based speech synthesis generally suffers from typical buzziness due to over-simplified excitati...
The quality of speech generated from Hidden Markov Model (HMM)-based Speech Synthesis Syste...
INTERSPEECH2010: 11th Annual Conference of the International Speech Communication Association, Septe...
The most important advantage of HMM-based TTS is its high intelligibility. However, speech synthesize...
This paper proposes the use of the Liljencrants-Fant model (LF-model) to represent the glottal source...
Parametric speech synthesis has received increased attention in recent years following the developme...
The excitation for LPC speech synthesis usually consists of two separate signals: a delta-function p...
Speech technology can facilitate human-machine interaction and create new communication interfaces....