Large, pre-trained neural networks consisting of self-attention layers (transformers) have recently achieved state-of-the-art results on several speech emotion recognition (SER) datasets. These models are typically pre-trained in a self-supervised manner with the goal of improving automatic speech recognition performance, and thus of capturing linguistic information. In this work, we investigate the extent to which this information is exploited during SER fine-tuning. Using a reproducible methodology based on open-source tools, we synthesise prosodically neutral speech utterances while varying the sentiment of the text. Valence predictions of the transformer model are very reactive to positive and negative sentiment content, as well as nega...
Speech Emotion Recognition (SER) plays a pivotal role in enhancing human-computer interaction by ena...
Speech Emotion Recognition (SER) makes it possible for machines to perceive affective information. O...
Creating machines with the ability to reason, perceive, learn and make decisions based on a human li...
Large, pre-trained neural networks consisting of self-attention layers (transformers) have recently ...
Large, pre-trained neural networks consisting of self-attention layers (transformers) have recently ...
Recent advances in transformer-based architectures that are pre-trained in a self-supervised manner h...
Speech Emotion Recognition (SER) has been shown to benefit from many of the recent advances in deep ...
Human emotion understanding is pivotal in making conversational technology mainstream. We view speec...
Automatic speech recognition is an active field of study in artificial intelligence and machine lear...
Large language models, in particular generative pre-trained transformers (GPTs), show impressive res...
We propose EmoDistill, a novel speech emotion recognition (SER) framework that leverages cross-modal...
© 2022, IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for ...
Self-supervised speech models have grown fast during the past few years and have proven feasible for...
Artificial intelligence (AI) has had a significant impact on various industries and sectors of socie...
Speech emotion recognition is essential for obtaining emotional intelligence which affects the under...