User-defined keyword spotting is the task of detecting new spoken terms defined by users. It can be viewed as a few-shot learning problem, since it is unreasonable to expect users to provide many examples of their desired keywords. To solve this problem, previous works either incorporate self-supervised learning models or apply meta-learning algorithms. However, it is unclear whether self-supervised learning and meta-learning are complementary, and which combination of the two types of approaches is most effective for few-shot keyword discovery. In this work, we systematically study these questions by utilizing various self-supervised learning models and combining them with a wide variety of meta-learning algorithms. Our results show that HuBERT c...
The success of deep learning methods hinges on the availability of large training datasets annotated...
Few-shot learning (FSL) aims to make predictions based on a limited number of samples. Structure...
This study presents a novel zero-shot user-defined keyword spotting model that utilizes the audio-ph...
For training a few-shot keyword spotting (FS-KWS) model, a large labeled dataset containing massive ...
In recent years, the development of accurate deep keyword spotting (KWS) models has resulted in KWS ...
Humans show a remarkable capability to accurately solve a wide range of problems efficiently -- util...
Few-shot learning (FSL) aims to generate a classifier using limited labeled examples. Many existing ...
In this paper, we consider the framework of multi-task representation (MTR) learning where the goal ...
Few-shot classification aims to adapt to new tasks with limited labeled examples. To fully use the a...
Recognizing a particular command or a keyword, keyword spotting has been widely used in many voice i...
We introduce MetaICL (Meta-training for In-Context Learning), a new meta-training framework for few-...
Model-agnostic meta-learning (MAML) is arguably one of the most popular meta-learning algorithms now...
Meta-learning has been shown to be an effective strategy for few-shot learning. The key idea is to l...
Despite achieving state-of-the-art zero-shot performance, existing vision-language models still fall...
A two-stage training paradigm consisting of sequential pre-training and meta-training stages has bee...