Building an emotional prosody model is essential for emotional speech synthesis. In practice, however, far less emotional data is available than neutral data, which is a serious obstacle. This paper builds a corpus covering three emotions: happiness, anger, and sadness. The parameters that affect emotional prosody are analyzed, and an emotional prosody model based on a neural network is built. Because the emotional corpus is small, training the prosody model runs into over-fitting caused by data sparsity. To exploit large-scale neutral data and improve the quality of the emotional prosody model, three methods are proposed, namely, the method of mixed corpus, ...
Abstract The performance of speech recognition systems trained with neutral utterances degrades sign...
Artificial Neural Network (ANN) models, specifically Convolutional Neural Networks (CNN), were appli...
The goal of the project is to detect the speaker's emotions while he or she speaks. Speech generated...
Abstract. The paper analyzes the prosody features, which include the intonation, speaking rate, int...
This paper proposes a system to convert neutral speech to emotional with controlled intensity of emo...
Speech emotion recognition is a crucial work direction in speech recognition. To increase the perfor...
Abstract This paper is related to the method of adding an emotional speech corpus to a high-quality l...
Abstract. This paper presents a technique for synthesizing emotional speech based on an emotion-inde...
Emotions play an important role in human life. They are essential for communication, for...
Prosodic features have been proven important to discriminate between different speech emotions, but ...
The expression of emotions in human communication plays a very important role in the information tha...
Research has focused on using prosody as an alternative source of information for language modeling....
The ability to understand people through spoken language is a skill that many human beings take for ...