Recent work suggests that transformer models are capable of multi-task learning on diverse NLP tasks. However, the potential of these models may be limited because they use the same set of parameters for all tasks. In contrast, humans tackle tasks in a more flexible way, making appropriate assumptions about which skills and knowledge are relevant and executing only the necessary computations. Inspired by this, we propose task-level mixture-of-experts models, which have a collection of transformer layers (i.e., experts) and a router component that chooses among these experts dynamically and flexibly. We show that the learned routing decisions and experts partially rediscover human categorization of NLP tasks -- certain experts are strongly associated...
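The abstract above describes the core architecture: a pool of transformer-layer experts plus a router that selects among them per task. Below is a minimal sketch of that idea in PyTorch; the module name TaskLevelMoE, the task-embedding router, and the top-k selection are illustrative assumptions rather than the paper's actual implementation.

```python
# Minimal sketch of task-level mixture-of-experts routing (illustrative only).
import torch
import torch.nn as nn


class TaskLevelMoE(nn.Module):
    """Routes each task to a small set of expert transformer layers."""

    def __init__(self, d_model: int, num_experts: int, num_tasks: int, top_k: int = 1):
        super().__init__()
        # Each expert is a standard transformer encoder layer.
        self.experts = nn.ModuleList(
            nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
            for _ in range(num_experts)
        )
        # The router scores experts from a learned task embedding,
        # so the routing decision is made per task rather than per token.
        self.task_embedding = nn.Embedding(num_tasks, d_model)
        self.router = nn.Linear(d_model, num_experts)
        self.top_k = top_k

    def forward(self, hidden: torch.Tensor, task_id: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, seq_len, d_model); task_id: (batch,) integer task indices
        logits = self.router(self.task_embedding(task_id))   # (batch, num_experts)
        weights = torch.softmax(logits, dim=-1)
        top_w, top_idx = weights.topk(self.top_k, dim=-1)     # pick k experts per task
        out = torch.zeros_like(hidden)
        for b in range(hidden.size(0)):
            for w, idx in zip(top_w[b], top_idx[b]):
                out[b] = out[b] + w * self.experts[int(idx)](hidden[b : b + 1])[0]
        return out


# Example: 4 experts shared across 8 tasks, each example routed by its task id.
moe = TaskLevelMoE(d_model=64, num_experts=4, num_tasks=8)
x = torch.randn(2, 10, 64)                 # (batch=2, seq_len=10, d_model=64)
y = moe(x, task_id=torch.tensor([0, 3]))
print(y.shape)                             # torch.Size([2, 10, 64])
```

Routing on a task embedding rather than on individual tokens keeps the choice of experts constant within a task, which is the distinguishing feature of task-level (as opposed to token-level) mixture-of-experts.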
Deep imitation learning requires many expert demonstrations, which can be hard to obtain, especially...
Recent work has shown the promise of creating generalist, transformer-based, models for language, vi...
In open-ended continuous environments, robots need to learn multiple parameter...
Mixtures of Experts combine the outputs of several “expert” networks, each of which specializes in ...
Sparsely-activated Mixture-of-experts (MoE) models allow the number of parameters to greatly increas...
Mixture of Experts (MoE) is a classical architecture for ensembles where each member is specialised...
Learning discriminative task-specific features simultaneously for multiple distinct tasks is a funda...
Neural Machine Translation (NMT) is notorious for its need for large amounts of bilingual data. An ef...
The mixture-of-experts model is a static neural network architecture in that it learns input-output ...
In this work we present a Mixture of Task-Aware Experts Network for Machine Reading Comprehension on...
Deep learning models for vision tasks are trained on large datasets under the assumption that there ...
Typical multi-task learning (MTL) methods rely on architectural adjustments and a large trainable pa...
Structure prediction (SP) tasks are important in natural language understanding in the sense that th...
In the context of multi-task learning, neural networks with branched architectures have often been e...
The Transformer model is a very recent, fast and powerful discovery in neural machine translation. W...
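Several of the related abstracts above (e.g., the classic Mixtures of Experts entry) describe combining the outputs of several expert networks through a gating function. A minimal sketch of that soft combination, again in PyTorch; SoftMoE and the linear experts are illustrative assumptions, not any particular paper's model.

```python
# Minimal sketch of the classic soft mixture-of-experts combination:
# a gating network produces one weight per expert and the output is the
# weighted sum of the expert outputs. All names are illustrative.
import torch
import torch.nn as nn


class SoftMoE(nn.Module):
    def __init__(self, d_in: int, d_out: int, num_experts: int):
        super().__init__()
        self.experts = nn.ModuleList(nn.Linear(d_in, d_out) for _ in range(num_experts))
        self.gate = nn.Linear(d_in, num_experts)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, d_in)
        weights = torch.softmax(self.gate(x), dim=-1)               # (batch, num_experts)
        expert_out = torch.stack([e(x) for e in self.experts], 1)   # (batch, num_experts, d_out)
        return (weights.unsqueeze(-1) * expert_out).sum(dim=1)      # (batch, d_out)
```

Sparsely-activated variants keep the same structure but evaluate only the top-scoring experts instead of all of them, which is how MoE models grow their parameter count without a matching growth in compute.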