Despite recent progress in speech emotion recognition (SER), state-of-the-art systems generalise poorly across different conditions. A key underlying reason is the scarcity of emotion datasets, which is a significant roadblock to designing robust machine learning (ML) models. Recent works in SER use multitask learning (MTL) methods to improve generalisation by learning shared representations. However, most of these studies propose MTL solutions that require meta labels for auxiliary tasks, which limits the training of SER systems. This paper proposes an MTL framework (MTL-AUG) that learns generalised representations from augmented data. We utilise augmentation-type classification and u...
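To illustrate why augmentation-type classification needs no meta labels, the following is a minimal sketch, not the MTL-AUG implementation. The augmentations (`identity`, `add_noise`, `change_gain`), the helper names, and the `aux_weight` value are all assumptions made for illustration; the abstract does not specify them. The point is only that the auxiliary label is known by construction, because we choose which augmentation to apply.

```python
import random

# Hypothetical augmentations on a raw waveform (a list of floats).
# The choice of transforms is illustrative, not taken from the paper.
def identity(wave, rng):
    return list(wave)

def add_noise(wave, rng, scale=0.01):
    # Add zero-mean Gaussian noise to every sample.
    return [x + rng.gauss(0.0, scale) for x in wave]

def change_gain(wave, rng, low=0.5, high=1.5):
    # Scale the whole waveform by one random gain factor.
    gain = rng.uniform(low, high)
    return [gain * x for x in wave]

AUGMENTATIONS = [identity, add_noise, change_gain]

def make_auxiliary_examples(waves, seed=0):
    """Return (augmented_wave, augmentation_id) pairs.

    The augmentation id is the auxiliary-task label. It comes for free
    because we pick the augmentation ourselves, so no human annotation
    is required.
    """
    rng = random.Random(seed)
    examples = []
    for wave in waves:
        for aug_id, augment in enumerate(AUGMENTATIONS):
            examples.append((augment(wave, rng), aug_id))
    return examples

def multitask_loss(emotion_loss, aug_type_loss, aux_weight=0.1):
    # Weighted sum of the primary and auxiliary losses.
    # aux_weight is an assumed hyperparameter, not a value from the paper.
    return emotion_loss + aux_weight * aug_type_loss
```

In a real system the waveforms would pass through a shared encoder with one head per task, and the two losses would be computed from those heads; the sketch stops at label construction and loss weighting.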
To solve the problem of feature distribution discrepancy in cross-corpus speech emotion recognition ...
The recognition of emotions in speech is one of the most challenging topics in data science. In this...
Obtaining large, human-labelled speech datasets to train models for emotion recognition is a notorio...
Despite the emerging importance of Speech Emotion Recognition (SER), the state-of-the-art accuracy i...
The absence of labeled samples limits the development of speech emotion recognition (SER). Data augm...
Despite the recent advancement in speech emotion recognition (SER) within a single corpus setting, t...
Speech emotion recognition (SER), a rapidly evolving task that aims to recognize the emotion of spea...
Generative adversarial networks (GANs) have shown potential in learning emotional attributes and gen...
In this paper, we focus on a challenging, but interesting, task in speech emotion recognition (SER),...
Despite the widespread use of supervised learning methods for speech emotion recognition, they are s...
In this paper, we investigate an interesting problem, i.e., unsupervised cross-corpus speech...
In many practical applications, a speech emotion recognition model learned on a source (training) do...