Prompt-Tuning is a new paradigm for fine-tuning pre-trained language models in a parameter-efficient way. Here, we explore the use of HyperNetworks to generate hyper-prompts: we propose HyperPrompt, a novel architecture for prompt-based task-conditioning of self-attention in Transformers. The hyper-prompts are end-to-end learnable via generation by a HyperNetwork. HyperPrompt allows the network to learn task-specific feature maps where the hyper-prompts serve as task global memories for the queries to attend to, while enabling flexible information sharing among tasks. We show that HyperPrompt is competitive with strong multi-task learning baselines with as few as $0.14\%$ additional task-conditioning parameters, achieving gr...
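To make the mechanism concrete, here is a minimal sketch (in PyTorch) of how hyper-prompt conditioning of self-attention could look: a small hypernetwork maps a task embedding to key/value hyper-prompts that are prepended to the attention keys and values, so the queries attend to them as task global memories. The class name, dimensions, and the two-layer hypernetwork below are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HyperPromptAttention(nn.Module):
    """Self-attention with task-conditioned key/value hyper-prompts (sketch)."""

    def __init__(self, d_model=512, n_heads=8, n_tasks=8, prompt_len=6, task_dim=64):
        super().__init__()
        self.n_heads, self.d_head = n_heads, d_model // n_heads
        self.prompt_len = prompt_len
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.out = nn.Linear(d_model, d_model)
        # Task embeddings; a small hypernetwork projects them into
        # key/value hyper-prompts that act as task global memories.
        self.task_emb = nn.Embedding(n_tasks, task_dim)
        self.hypernet = nn.Sequential(
            nn.Linear(task_dim, task_dim),
            nn.ReLU(),
            nn.Linear(task_dim, 2 * prompt_len * d_model),
        )

    def forward(self, x, task_id):
        B, T, D = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)

        # Generate this task's hyper-prompts and prepend them to keys/values,
        # so every query position can attend to the task memories.
        prompts = self.hypernet(self.task_emb(task_id))            # (B, 2*L*D)
        pk, pv = prompts.view(B, 2, self.prompt_len, D).unbind(1)  # (B, L, D) each
        k = torch.cat([pk, k], dim=1)
        v = torch.cat([pv, v], dim=1)

        def heads(t):  # (B, S, D) -> (B, H, S, d_head)
            return t.view(B, t.size(1), self.n_heads, self.d_head).transpose(1, 2)

        attn = F.scaled_dot_product_attention(heads(q), heads(k), heads(v))
        return self.out(attn.transpose(1, 2).reshape(B, T, D))

# Example: a batch of 4 sequences, each tagged with a task id.
layer = HyperPromptAttention()
x = torch.randn(4, 128, 512)
y = layer(x, task_id=torch.tensor([0, 0, 1, 2]))  # -> (4, 128, 512)
```

Because only the task embeddings and the hypernetwork are task-specific, the added parameter count stays small relative to the backbone, which is the source of the parameter efficiency claimed above.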
Transformer networks have seen great success in natural language processing and machine vision, wher...
Transformer models cannot easily scale to long sequences due to their $O(N^2)$ time and space complexi...
Prompting has shown impressive success in enabling large pretrained language models (LMs) to perform...
We propose structured prompt tuning, a simple and effective method to improve prompt tuning. Instead...
Transformer-based architectures are the model of choice for natural language understanding, but they...
We investigate input-conditioned hypernetworks for multi-tasking in NLP, generating parameter-effici...
Speech representations learned from Self-supervised learning (SSL) models can benefit various speech...
Parameter-efficient fine-tuning (PEFT) has shown its effectiveness in adapting the pre-trained langu...
Fine-tuning large language models for different tasks can be costly and inefficient, and even method...
Prompt tuning learns soft prompts to condition frozen Pre-trained Language Models (PLMs) for perform...
Prompt tuning (PT) is an effective approach to adapting pre-trained language models to downstream ta...
We evaluate three simple, normalization-centric changes to improve Transformer training. First, we s...
Recent works have shown promising results of prompt tuning in stimulating pre-trained language model...
The transformer architecture and variants presented remarkable success across many machine learning ...
The current modus operandi in adapting pre-trained models involves updating all the backbone paramet...