Abstract: In this paper, we explore how to construct stylistic TTS databases from audiobooks in which a storyteller performs multiple roles. The goal is to identify and build a set of speech corpora, each of which not only portrays a representative voice style performed by the speaker but also has enough sentences to synthesize natural speech using a unit selection approach. We solve the problem in two steps: first, by representing each role with a Gaussian Mixture Model (GMM), all speech data are partitioned into a number of voice style clusters under a criterion that maximizes the likelihood of all utterances with respect to the roles' speaker models; then, pruning in terms of both acoustic and prosodic measures follows to pur...
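The likelihood-based partitioning step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the per-role GMMs are already trained, uses diagonal covariances, and omits any iterative re-estimation of the models as well as the pruning step. The function names, role labels, and toy feature values are all hypothetical.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def log_gmm(x, gmm):
    """Log density of a GMM given as a list of (weight, mean, var) tuples."""
    terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
    top = max(terms)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))

def utterance_loglik(frames, gmm):
    """Total log-likelihood of an utterance, treating frames as independent."""
    return sum(log_gmm(frame, gmm) for frame in frames)

def assign_utterances(utterances, role_gmms):
    """Put each utterance in the cluster of the role model that scores it highest."""
    clusters = {role: [] for role in role_gmms}
    for utt_id, frames in utterances.items():
        best = max(role_gmms, key=lambda r: utterance_loglik(frames, role_gmms[r]))
        clusters[best].append(utt_id)
    return clusters

# Hypothetical toy models: two roles over 2-dimensional features.
role_gmms = {
    "narrator": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 1.0], [1.0, 1.0])],
    "villain": [(1.0, [5.0, 5.0], [1.0, 1.0])],
}
# Hypothetical toy utterances: each is a short list of feature frames.
utterances = {
    "u1": [[0.1, -0.2], [0.9, 1.1]],
    "u2": [[5.2, 4.8], [4.9, 5.1]],
}
clusters = assign_utterances(utterances, role_gmms)
```

Here each utterance is assigned independently, which maximizes the summed log-likelihood over all utterances for fixed role models. In practice the frames would be acoustic feature vectors extracted from the audiobook recordings.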
In this thesis, we study the expressivity of read speech with a particular type of data, which are...
A general data-driven procedure for creating new prosodic modules for the Italian FESTIVAL Text-To-S...
This review gives a general overview of techniques used in statistical parametric speech synthesis. ...
In this work we design an approach for automatic feature selection and voice creation for expressive...
Freely available audiobooks are a rich resource of expressive speech recordings that can be used for...
One of the biggest challenges in speech synthesis is the production of contextually-appropriate natu...
The goal of this study was to determine which acoustic parameters are significant in differentiating...
One of the biggest challenges in speech synthesis is the production of contextually-appropriate natu...
Expressive synthesis from text is a challenging problem. There are two issues. First, read text is o...
Audiobooks are a powerful source of rich information for speech synthesis. Recent work has been foc...
Text-to-speech synthesis (TTS) has progressed to such a stage that given a large, clean, phoneticall...
In this thesis work, we address the expressivity of read speech with a particular type of data...
Text-to-speech synthesis (TTS) turns a written text into an audio speech signal. Many commercial sys...
This work presents a study on the suitability of prosodic and acoustic features, with a special focus...
When creating voices for concatenative speech synthesis, several hours of speech uttered by a profes...