SUPERB was proposed to evaluate the generalizability of self-supervised learning (SSL) speech models across various tasks. However, it incurs high computational costs due to the large datasets and diverse tasks. In this paper, we introduce MiniSUPERB, a lightweight benchmark that efficiently evaluates SSL speech models, yielding results comparable to SUPERB at significantly lower computational cost. We carefully select representative tasks, sample datasets, and extract model representations offline. Our approach achieves Spearman's rank correlations of 0.954 and 0.982 with SUPERB Paper and SUPERB Challenge, respectively. Additionally, we reduce the computational cost by 97% in terms of Multiply-ACcumulate operations (MACs). Furthermore, we ev...
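To make the agreement metric concrete, here is a minimal sketch of how a Spearman's rank correlation between two benchmarks' model scores could be computed. The helper names (`average_ranks`, `spearman`) and the five score pairs are hypothetical and chosen only for illustration; they are not taken from MiniSUPERB or SUPERB.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of values tied with the one at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("rank correlation is undefined for constant input")
    return cov / sqrt(vx * vy)


# Hypothetical overall scores for five models (illustrative values only).
full_benchmark_scores = [71.2, 80.5, 64.3, 77.9, 83.1]
light_benchmark_scores = [69.8, 79.0, 66.1, 80.2, 82.4]

rho = spearman(full_benchmark_scores, light_benchmark_scores)
print(f"Spearman's rho = {rho:.3f}")
```

A value close to 1 means the lightweight benchmark orders the models almost the same way as the full one, which is the sense in which the reported correlations of 0.954 and 0.982 indicate comparable results.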
In recent years, the self-supervised learning paradigm has received extensive attention due to its great...
Self-supervised learning (SSL) representation for speech has achieved state-of-the-art (SOTA) perfor...
The modern paradigm in speech processing has demonstrated the importance of scale and compute for en...
Recent years have witnessed great strides in self-supervised learning (SSL) on the speech processing...
In recent years, speech-based self-supervised learning (SSL) has made significant progress in variou...
Self-supervised learning (SSL) is at the origin of unprecedented improvements in many different doma...
Self-Supervised Learning (SSL) using huge unlabeled data has been successfully explored for image an...
Large-scale speech self-supervised learning (SSL) has emerged as a main field of speech processing...
Self-supervised representation learning (SSRL) has improved the performance on downstream phoneme re...
Self-supervised learning (SSL) achieves great success in speech recognition, while limited explorati...
Self-supervised learning (SSL) methods such as WavLM have shown promising speech separation (SS) res...
Speech representations learned from self-supervised learning (SSL) models can benefit various speech...
Self-supervised learning (SSL) for rich speech representations has achieved empirical success in low...
We summarize the results of a host of efforts using giant automatic speech recognition (ASR) models ...
While FastSpeech2 aims to integrate aspects of speech such as pitch, energy, and duration as conditi...