This work aims to create expressive voices from audiobooks using semantic selection. First, an acoustic feature vector is extracted for each utterance of the audiobook, including i-vectors computed from MFCCs and from F0. Then, the transcription of each utterance is projected into a semantic vector space. A seed utterance is projected into the same space and its N nearest neighbors are selected. The selection is then filtered so that only acoustically similar data is kept. The proposed technique can be used to train emotional voices by using emotional keywords or phrases as seeds, which yields training data semantically similar to the seed. It can also be used to read larger texts in an expressive manner, creating a specific voice for each sentence. Th...
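The two-stage selection described in the abstract (N nearest neighbors in the semantic space, then an acoustic filter) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names, the use of cosine similarity, and the choice to measure acoustic similarity against the centroid of the candidates are all assumptions, since the abstract does not state how either similarity is computed. The semantic and acoustic vectors are assumed to be precomputed, one row per utterance.

```python
import numpy as np


def cosine_similarity(matrix, vector):
    """Cosine similarity between each row of `matrix` and `vector`."""
    matrix_norms = np.linalg.norm(matrix, axis=1)
    vector_norm = np.linalg.norm(vector)
    # Small epsilon guards against division by zero for all-zero vectors.
    return (matrix @ vector) / (matrix_norms * vector_norm + 1e-12)


def select_utterances(seed_semantic, semantic_vectors, acoustic_vectors,
                      n_neighbors, acoustic_threshold):
    """Return indices of utterances selected for a seed (hypothetical helper).

    seed_semantic:      semantic vector of the seed keyword or phrase
    semantic_vectors:   (num_utterances, d_sem) transcription embeddings
    acoustic_vectors:   (num_utterances, d_ac) acoustic features per utterance
    n_neighbors:        N, the number of semantic neighbors to keep
    acoustic_threshold: minimum cosine similarity to the candidates' centroid
                        (an assumed filtering criterion)
    """
    # Stage 1: the N utterances whose transcriptions are closest to the seed.
    semantic_scores = cosine_similarity(semantic_vectors, seed_semantic)
    n = min(n_neighbors, len(semantic_scores))
    nearest = np.argsort(-semantic_scores)[:n]

    # Stage 2: keep only candidates that are acoustically similar to the
    # average candidate, discarding acoustic outliers.
    centroid = acoustic_vectors[nearest].mean(axis=0)
    acoustic_scores = cosine_similarity(acoustic_vectors[nearest], centroid)
    return nearest[acoustic_scores >= acoustic_threshold]
```

Under these assumptions, the returned indices would identify the utterances used to train the voice for that seed; both `n_neighbors` and `acoustic_threshold` would need tuning on real data.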
This paper presents algorithms that allow a robot to express its emotions by m...
Getting a text to speech synthesis (TTS) system to speak lively animated stories like a human is ver...
In the field of expressive speech synthesis, a lot of work has been conducted ...
The goal of the study is to predict acoustic features of expressive speech from semantic vector spac...
Expressive synthesis from text is a challenging problem. There are two issues. First, read text is o...
In this work we design an approach for automatic feature selection and voice creation for expressive...
This work presents a study on the suitability of prosodic and acoustic features, with a special focus...
Nowadays, especially with the upswing of neural networks, speech synthesis is almost totally data dr...
Freely available audiobooks are a rich resource of expressive speech recordings that can be used for...
Generating expressive, natural-sounding speech from text using a speech synthesis (TTS) system is...
Audiobooks are a powerful source of rich information for speech synthesis. Recent work has been foc...
In this paper we present a DNN-based speech synthesis system trained on an audiobook including senti...
To obtain a robust acoustic model for a certain speech recognition task, a large amount of speech da...
Speech Emotion Recognition (SER) makes it possible for machines to perceive affective information. O...
Great improvement has been made in the field of expressive audiovisual Text-to...