Our research goal is to construct a Japanese TTS (Text-to-Speech) system that can output various kinds of prosody. Because such synthetic speech is useful in practice, many TTS systems have implemented global prosodic control. Fundamentally, however, they are designed to output speech with standard pitch and speech rate. We discuss a synthesis method for high-quality speech with extreme prosody (very high, low, fast, and slow) from the viewpoint of the speech database. As the speech synthesis method, we employ unit selection and concatenation. We also introduce an analysis-synthesis process to give precise target prosody to the output speech. Many studies have reported that speech quality gets worse in proportion to the amount of prosody m...
Text to speech synthesis (TTS) is the production of artificial speech by a machine for the given tex...
Schuurman, I., & Vandeghinste. This paper reports on prosodic evaluation in the...
Speech generation is the process which allows the transformation of a string of phonetic and prosodi...
LREC2004: the 4th International Conference on Language Resources and Evaluation, May 24-30, 2004, L...
This research aims to construct a high-quality Japanese TTS (Text-to-Speech) system that has high fl...
ICSLP2002: the 7th International Conference on Spoken Language Processing, September 16-20, 2002, ...
Improving the naturalness of synthetic speech is an essential task in developing a text-to-speech (T...
Currently working at Asahi Kasei Corporation. For the purpose of building a speech synthesis system tha...
Abstract: This paper presents research on the use of Fujisaki parameters for the quality prediction ...
LREC2002: International Conference on Language Resources and Evaluation, May 29-31, 2002, Las Palma...
A prosody adaptation text-to-speech system based on concatenation of spoken English words is present...
The perceived quality of synthetic speech strongly depends on its prosodic naturalness. Departing fr...
The ultimate goal of text-to-speech synthesis is to convert ordinary orthographic text into an acous...
Baumann T, Schlangen D. Evaluating Prosodic Processing for Incremental Speech Synthesis. In: Procee...
The lack of prosody variation in text-to-speech systems contributes to their perceived unnaturalness...