Progress in speech processing has been facilitated by shared datasets and benchmarks. Historically these have focused on automatic speech recognition (ASR), speaker identification, or other lower-level tasks. Interest has been growing in higher-level spoken language understanding tasks, including using end-to-end models, but there are fewer annotated datasets for such tasks. At the same time, recent work shows the possibility of pre-training generic representations and then fine-tuning for several tasks using relatively little labeled data. We propose to create a suite of benchmark tasks for Spoken Language Understanding Evaluation (SLUE) consisting of limited-size labeled training sets and corresponding evaluation sets. This resource would...
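As a concrete illustration of the pre-train-then-fine-tune recipe this benchmark is meant to exercise, the sketch below fine-tunes a publicly released wav2vec 2.0 encoder for a three-class utterance-level sentiment task using a small labeled batch. This is a minimal sketch under stated assumptions, not the SLUE baseline system: the checkpoint name, label set, and training loop are illustrative, and dataset loading is stubbed out since the annotations ship separately.

```python
# Minimal sketch (not the SLUE baseline): fine-tune a pre-trained wav2vec 2.0
# encoder for utterance-level sentiment classification on a small labeled set.
# Assumes the HuggingFace `transformers` and `torch` packages are installed;
# the 3-way label set (negative / neutral / positive) is an assumption here.
import torch
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2ForSequenceClassification

extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-base")
model = Wav2Vec2ForSequenceClassification.from_pretrained(
    "facebook/wav2vec2-base", num_labels=3  # classifier head is newly initialized
)

def fine_tune_step(waveforms, labels, optimizer):
    """One gradient step on a small labeled batch of 16 kHz mono waveforms."""
    inputs = extractor(waveforms, sampling_rate=16_000,
                       return_tensors="pt", padding=True)
    outputs = model(**inputs, labels=torch.tensor(labels))
    outputs.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return outputs.loss.item()

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
# Toy batch: two one-second clips of silence, purely to show the call pattern;
# in practice these would come from the limited-size labeled training set.
loss = fine_tune_step([[0.0] * 16_000, [0.0] * 16_000], [0, 2], optimizer)
print(f"loss: {loss:.3f}")
```

In this setting only the small classification head is trained from scratch, while the pre-trained encoder supplies the generic speech representation, which is why relatively little labeled data can suffice.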