Freely available audiobooks are a rich resource of expressive speech recordings that can be used for the purposes of speech synthesis. Natural-sounding, expressive synthetic voices have previously been built from audiobooks that contained large amounts of highly expressive speech recorded from a professionally trained speaker. The majority of freely available audiobooks, however, are read by amateur speakers, are shorter, and contain less expressive (less emphatic, less emotional, etc.) speech both in terms of quality and quantity. Synthesizing expressive speech from a typical online audiobook therefore poses many challenges. In this work we address these challenges by applying a method consisting of minimally supervised techniques...
In modern days synthesis of human images and videos is arguably one of the most popular topics in th...
This work aims at creating expressive voices from audiobooks using semantic selection. First, for ea...
In this paper, we explore how to construct stylistic TTS databases from audio books, in whi...
Expressive synthesis from text is a challenging problem. There are two issues. First, read text is o...
Audiobooks are a powerful source of rich information for speech synthesis. Recent work has been foc...
In this work we design an approach for automatic feature selection and voice creation for expressive...
While text-to-speech has long been centered on the production of an intelligible message of good qua...
This paper describes recent progress in our approach to generating expressive speech. A goal of text...
One of the biggest challenges in speech synthesis is the production of naturally sounding synthetic ...
In this thesis, we study the expressivity of read speech with a particular type of data, which are...
This work presents a study on the suitability of prosodic and acoustic features, with a special focus...
Generating expressive, natural-sounding speech from text using a speech synthesis (TTS) system is...
When designing human-machine interfaces it is important to consider not only the bare bones function...