This paper proposes architectures that facilitate the extrapolation of emotional expressions in deep neural network (DNN)-based text-to-speech (TTS). Here, to “extrapolate emotional expressions” means to borrow emotional expressions from other speakers, so that emotional speech uttered by the target speakers need not be collected. Although DNNs have the potential to support TTS with emotional expressions, and some DNN-based TTS systems have demonstrated satisfactory performance in expressing the diversity of human speech, they still require emotional speech from the target speakers, which is troublesome to collect. To solve this issue, we propose architectures to separately train the speaker feature and the emotio...
Several attempts have been made to synthesize speech from text. However, existing methods tend to ge...
Speech can express subjective meanings and intents that, in order to be fully understood, rely heavi...
Speech emotion recognition (SER) is currently a research hotspot due to its challenging nature but b...
In this work we try to perform emotional style transfer on audio. In particular, MelGAN-VC architec...
Great improvement has been made in the field of expressive audiovisual Text-to...
Emotional prosody model building is very important for emotional speech synthesis. However, in the c...
This paper proposes an effective emotional text-to-speech (TTS) system with a pre-trained language m...
Speech emotion conversion is the task of modifying the perceived emotion of a ...
There is an apparent evolving interest in speech emotion recognition (SER), one of the particular c...
This thesis presents the work of incorporating facial animation with emotions into a neural text-to-...
Recently, text-to-speech (TTS) synthesis has gained immense success in the human-computer interactio...
The expression of emotions in human communication plays a very important role in the information tha...
This paper presents a technique for synthesizing emotional speech based on an emotion-inde...
In modern days synthesis of human images and videos is arguably one of the most popular topics in th...
The ability to understand people through spoken language is a skill that many human beings take for ...