Transformer models have achieved promising results on natural language processing (NLP) tasks, including extractive question answering (QA). Common Transformer encoders used in NLP tasks process the hidden states of all input tokens in the context paragraph throughout all layers. However, unlike tasks such as sequence classification, answering the raised question does not necessarily require all the tokens in the context paragraph. Following this motivation, we propose Block-Skim, which learns to skim unnecessary context in higher hidden layers to improve and accelerate Transformer performance. The key idea of Block-Skim is to identify the context blocks that must be further processed and those that can be safely discarded early on...
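The early-discarding idea above can be illustrated with a minimal sketch. This is not the paper's actual implementation; the function name, array shapes, and threshold are all assumptions chosen for illustration. It shows the core operation: given per-block relevance scores (which Block-Skim would predict from attention or a learned head), keep only the context blocks that pass a threshold so later layers process fewer tokens.

```python
import numpy as np

def block_skim_step(hidden_states, block_scores, threshold=0.5):
    """Hypothetical sketch of a Block-Skim-style skimming step.

    hidden_states: (num_blocks, block_len, dim) per-block token states
    block_scores:  (num_blocks,) predicted relevance of each block
    Returns the retained blocks and the boolean keep-mask.
    """
    keep = block_scores > threshold          # blocks worth processing further
    return hidden_states[keep], keep         # discarded blocks skip later layers

# Example: 4 context blocks, 8 tokens each, hidden size 16
h = np.zeros((4, 8, 16))
scores = np.array([0.9, 0.1, 0.7, 0.2])
kept, mask = block_skim_step(h, scores)
```

In practice the keep-mask would also be applied to the attention mask so that surviving tokens never attend to discarded blocks, which is where the speedup comes from.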
Sparse Transformers have surpassed Graph Neural Networks (GNNs) as the state-of-the-art architecture...
A long-term ambition of information seeking QA systems is to reason over multi-modal contexts and ge...
In Natural Language Processing (NLP), Automatic Question Generation (AQG) is an important task that ...
We propose TandA, an effective technique for fine-tuning pre-trained Transformer models for natural ...
Given a large Transformer model, how can we obtain a small and computationally efficient model which...
Given a large Transformer model, how can we obtain a small and computationally efficient model which...
Transformer models cannot easily scale to long sequences due to their O(N^2) time and space complexi...
An important task for designing QA systems is answer sentence selection (AS2): selecting the sentenc...
The goal of this article is to develop a multiple-choice questions generation system that has a numb...
Transformer models, trained and publicly released over the last couple of years, have proved effecti...
Transformers are powerful for sequence modeling. Nearly all state-of-the-art language models and pre...
Large transformer models can highly improve Answer Sentence Selection (AS2) tasks, but their high co...
State space models (SSMs) have shown impressive results on tasks that require modeling long-range de...
Theoretical thesis. Bibliography: pages 49-57. 1 Introduction -- 2 Background and literature review -- ...
Retrieval augmented language models have recently become the standard for knowledge intensive tasks....