Transformer-based models have shown their effectiveness across multiple domains and tasks. Self-attention allows information from all sequence elements to be combined into context-aware representations. However, global and local information has to be stored mostly in the same element-wise representations. Moreover, the length of an input sequence is limited by the quadratic computational complexity of self-attention. In this work, we propose and study a memory-augmented segment-level recurrent Transformer (Recurrent Memory Transformer). Memory allows the model to store and process local and global information, and to pass information between segments of a long sequence with the help of recurrence. We implement a memory mechanism with no changes to Tr...
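To make the segment-level recurrence concrete, below is a minimal sketch of one way such a memory mechanism can be realized, assuming memory takes the form of learned embeddings prepended to each segment and the Transformer backbone itself is left unmodified; the module, hyperparameters, and the decision to keep full backpropagation through segments are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn as nn


class RecurrentMemorySketch(nn.Module):
    """Sketch of segment-level recurrence with memory tokens.

    Assumption: memory is a block of learned embeddings prepended to each
    segment; the backbone's outputs at those positions become the memory
    passed to the next segment. Hyperparameters are illustrative.
    """

    def __init__(self, d_model=128, n_heads=4, n_layers=2, num_mem_tokens=8):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, n_layers)  # unmodified Transformer
        self.mem_init = nn.Parameter(torch.randn(1, num_mem_tokens, d_model))
        self.num_mem_tokens = num_mem_tokens

    def forward(self, segments):
        """segments: list of [batch, seg_len, d_model] embedded segment tensors."""
        memory = self.mem_init.expand(segments[0].size(0), -1, -1)
        outputs = []
        for seg in segments:
            # Prepend memory tokens so self-attention can both read and update them.
            x = torch.cat([memory, seg], dim=1)
            h = self.backbone(x)
            # The updated memory recurs to the next segment; gradients over segments
            # could be truncated here with .detach() to bound BPTT cost.
            memory = h[:, : self.num_mem_tokens]
            outputs.append(h[:, self.num_mem_tokens:])
        return outputs, memory


# Usage: split a long embedded sequence into fixed-size segments and process them in order.
if __name__ == "__main__":
    model = RecurrentMemorySketch()
    long_seq = torch.randn(2, 512, 128)          # [batch, length, d_model]
    segments = list(long_seq.split(128, dim=1))  # four segments of 128 tokens
    outs, final_memory = model(segments)
```

In this sketch the recurrence lives entirely in the small memory block, so the per-segment cost of self-attention stays quadratic only in the segment length plus the number of memory tokens, rather than in the full sequence length.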
Stacking long short-term memory (LSTM) cells or gated recurrent units (GRUs) as part of a recurrent...
Recurrent neural networks can learn complex transduction problems that require maintaining and activ...
Transformers have achieved success in both language and vision domains. However, it is prohibitively...
Originally developed for natural language problems, transformer models have recently been widely use...
We introduce the Block-Recurrent Transformer, which applies a transformer layer in a recurrent fashi...
In this work, we propose Retentive Network (RetNet) as a foundation architecture for large language ...
Deep learning has achieved great success in many sequence learning tasks such as machine translation...
State space models (SSMs) have shown impressive results on tasks that require modeling long-range de...
Pretrained transformer models have demonstrated remarkable performance across various natural langua...
Learning to solve sequential tasks with recurrent models requires the ability to memorize long seque...
Transformer encoder-decoder models have shown impressive performance in dialogue modeling. However, ...
The Transformer architecture has revolutionized deep learning on sequential data, becoming ubiquitou...
The transformer architecture and variants presented remarkable success across many machine learning ...
Memory is an important component of effective learning systems and is crucial in non-Markovian as we...
Reinforcement learning (RL) algorithms face two distinct challenges: learning effective representati...