The attention mechanism is the key to many state-of-the-art transformer-based models in Natural Language Processing and Computer Vision. These models are pretrained on large datasets, and their sizes are growing rapidly. At the same time, their computation and data-movement costs and on-chip memory demands are growing beyond the capabilities of edge devices. This thesis addresses these challenges by developing strategies to prune inconsequential attention scores efficiently and effectively. The attention score is the core of the attention mechanism in all transformer-based models: it measures the correlation between two tokens in a sequence. A low attention score indicates an unimportant correlation and minimal impact...
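The score-pruning idea described in this abstract can be sketched as follows. This is a minimal illustrative NumPy sketch of scaled dot-product attention with a naive magnitude threshold on the post-softmax scores; the function name, threshold value, and renormalization step are assumptions for illustration, not the thesis's actual algorithm.

```python
import numpy as np

def pruned_attention(Q, K, V, threshold=0.1):
    """Scaled dot-product attention with naive score pruning (illustrative).

    Q, K, V: (seq_len, d) arrays. `threshold` is a hypothetical cutoff:
    attention probabilities below it are treated as inconsequential.
    """
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                    # raw attention scores
    # numerically stable softmax over the key dimension
    probs = np.exp(scores - scores.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)
    # zero out low scores, then renormalize the surviving entries
    probs = np.where(probs < threshold, 0.0, probs)
    probs /= probs.sum(axis=-1, keepdims=True)
    return probs @ V
```

In a hardware-oriented setting, the point of such pruning is that the zeroed entries let the accelerator skip both the corresponding multiply-accumulates and the data movement for the associated value rows.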
The Transformer architecture is ubiquitously used as the building block of large-scale autoregressiv...
Transformers are the state-of-the-art for machine translation and grammar error correction. One of t...
The attention mechanism plays a crucial role among the key technologies in transformer-based visual trac...
The attention mechanism is the key to many state-of-the-art transformer-based models in Natural Lang...
We introduce Performers, Transformer architectures which can estimate regular (softmax) full-rank-at...
Transformer-based models excel in speech recognition. Existing efforts to optimize Transformer infer...
Recent years have seen the vast potential of the Transformer model, as it is arguably the first gene...
The self-attention mechanism is rapidly emerging as one of the most important key primit...
The attention mechanism is considered the backbone of the widely-used Transformer architecture. It c...
The study of specialized accelerators tailored for neural networks is becoming a promising topic in ...
Transformer trackers have achieved impressive advancements recently, where the attention mechanism p...
As the key component of Transformer models, the attention mechanism has shown great power in learnin...
The attention mechanism has become the dominant module in natural language processing models. It is comp...
In this paper, we propose that the dot product pairwise matching attention layer, which is widely us...
The quadratic computation complexity of self-attention has been a persistent challenge when applying...