Task-conditional architectures offer an advantage in parameter efficiency but fall short in performance compared with state-of-the-art multi-decoder methods. Trading off performance against model parameters is therefore an important and difficult problem. In this paper, we introduce a simple and lightweight task-conditional model called Prompt Guided Transformer (PGT) to address this challenge. Our approach designs a Prompt-conditioned Transformer block that incorporates task-specific prompts into the self-attention mechanism, achieving global dependency modeling and parameter-efficient feature adaptation across multiple tasks. This block is integrated into both the shared encoder and decoder, enhancing the capture of intra- and inter-task features. Mo...
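As a concrete illustration of the mechanism described above, here is a minimal PyTorch sketch of self-attention conditioned on learnable task-specific prompts. The module name, prompt length, and task-indexing scheme are assumptions made for the example, not the released PGT implementation.

```python
import torch
import torch.nn as nn

class PromptConditionedSelfAttention(nn.Module):
    """Self-attention in which learnable task-specific prompt tokens are
    prepended to the key/value sequence, so one shared backbone adapts to
    each task through a small number of extra parameters.
    Illustrative sketch only; not the published PGT code."""

    def __init__(self, dim, num_heads, num_tasks, prompt_len):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        # One small prompt per task: (num_tasks, prompt_len, dim)
        self.task_prompts = nn.Parameter(torch.randn(num_tasks, prompt_len, dim) * 0.02)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x, task_id):
        # x: (batch, tokens, dim); task_id selects the prompt for the current task
        b = x.size(0)
        prompt = self.task_prompts[task_id].unsqueeze(0).expand(b, -1, -1)
        x_n = self.norm(x)
        kv = torch.cat([prompt, x_n], dim=1)        # keys/values also attend to the prompt
        out, _ = self.attn(x_n, kv, kv)             # queries come only from the input tokens
        return x + out                              # residual connection


# Usage sketch: the same shared block serves two tasks; only the selected prompt differs.
block = PromptConditionedSelfAttention(dim=256, num_heads=8, num_tasks=2, prompt_len=4)
feats = torch.randn(2, 196, 256)                    # e.g. 14x14 patch tokens
seg_feats = block(feats, task_id=0)                 # e.g. segmentation branch
depth_feats = block(feats, task_id=1)               # e.g. depth branch
```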
The Transformer model is a very recent, fast and powerful discovery in neural machine translation. W...
Transformer-based neural models are used in many AI applications. Training these models is expensive...
Recently, large-scale transformer-based models have been proven to be effective over various tasks a...
Previous multi-task dense prediction studies developed complex pipelines such as multi-modal distill...
Convolutional neural networks (CNNs) and Transformers have their own advantages and both have been wid...
Multi-head attention is a driving force behind state-of-the-art transformers, which achieve remarkab...
Structure prediction (SP) tasks are important in natural language understanding in the sense that th...
Adapting large-scale pretrained models to various downstream tasks via fine-tuning is a standard met...
We introduce the first multitasking vision transformer adapters that learn generalizable task affini...
In this paper, we take advantage of previous pre-trained models (PTMs) and propose a novel Chine...
Learning discriminative task-specific features simultaneously for multiple distinct tasks is a funda...
Given a large Transformer model, how can we obtain a small and computationally efficient model which...
More transformer blocks with residual connections have recently achieved impressive results on vario...
We introduce Performers, Transformer architectures which can estimate regular (softmax) full-rank-at...
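The Performer entry above refers to approximating full softmax attention with positive random features (the FAVOR+ idea). A minimal sketch under that reading follows; it uses plain i.i.d. Gaussian projections rather than the orthogonal ones of the published method, and all function names are illustrative.

```python
import torch

def favor_positive_features(x, w):
    """Positive random features for the softmax kernel:
    phi(x) = exp(w @ x - ||x||^2 / 2) / sqrt(m), so that
    E[phi(q) . phi(k)] = exp(q . k).  Illustrative re-implementation,
    not the authors' code."""
    m = w.shape[0]
    proj = x @ w.t()                                    # (tokens, m)
    sq_norm = 0.5 * (x ** 2).sum(dim=-1, keepdim=True)  # ||x||^2 / 2 per token
    return torch.exp(proj - sq_norm) / m ** 0.5

def performer_attention(q, k, v, num_features=256):
    """Linear-time approximation of softmax attention: cost grows with
    tokens * num_features * dim instead of tokens^2 * dim."""
    d = q.shape[-1]
    q, k = q / d ** 0.25, k / d ** 0.25                 # fold in the 1/sqrt(d) scaling
    w = torch.randn(num_features, d)                    # shared random projections
    q_f, k_f = favor_positive_features(q, w), favor_positive_features(k, w)
    kv = k_f.transpose(-2, -1) @ v                      # (m, dim_v), built once
    normalizer = q_f @ k_f.sum(dim=-2, keepdim=True).transpose(-2, -1)
    return (q_f @ kv) / (normalizer + 1e-6)

# Sanity check against exact softmax attention on a short sequence.
q, k, v = torch.randn(32, 64), torch.randn(32, 64), torch.randn(32, 64)
exact = torch.softmax(q @ k.t() / 64 ** 0.5, dim=-1) @ v
approx = performer_attention(q, k, v, num_features=4096)
print((exact - approx).abs().mean())
```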