Transformer networks have seen great success in natural language processing and machine vision, where task objectives such as next word prediction and image classification benefit from nuanced context sensitivity across high-dimensional inputs. However, there is an ongoing debate about how and when transformers can acquire highly structured behavior and achieve systematic generalization. Here, we explore how well a causal transformer can perform a set of algorithmic tasks, including copying, sorting, and hierarchical compositions of these operations. We demonstrate strong generalization to sequences longer than those used in training by replacing the standard positional encoding typically used in transformers with labels arbitrarily paired with items in the sequence ...
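To make the label-based scheme concrete, the sketch below shows one way such an encoding could look, assuming PyTorch; the class name RandomLabelPositionalEncoding, the max_label range, and the sample-then-sort step are illustrative assumptions rather than the paper's exact construction.

```python
import torch
import torch.nn as nn

class RandomLabelPositionalEncoding(nn.Module):
    """Illustrative label-based positional encoding (an assumption, not the
    paper's published code): each item in a sequence is tagged with a random
    label drawn from a range much larger than any training sequence length."""

    def __init__(self, d_model: int, max_label: int = 1024):
        super().__init__()
        self.max_label = max_label
        self.label_embedding = nn.Embedding(max_label, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model) token embeddings.
        batch, seq_len, _ = x.shape
        # Sample distinct labels for each sequence, then sort them so the
        # labels carry ordering information without encoding absolute position.
        labels = torch.stack([
            torch.randperm(self.max_label, device=x.device)[:seq_len].sort().values
            for _ in range(batch)
        ])
        # Add the label embeddings where standard positional encodings would go.
        return x + self.label_embedding(labels)
```

Because longer test sequences still draw their labels from the same range used in training, the model never has to extrapolate to unseen position indices, which is the property the abstract credits for generalization beyond training lengths.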
The deep learning architecture associated with ChatGPT and related generative AI products is known as ...
Self-supervised training methods for transformers have demonstrated remarkable performance across various ...
The transformer is a neural network component that can be used to learn useful representations of sequences ...
Despite progress across a broad range of applications, Transformers have limited success in systematic generalization ...
Can transformers generalize efficiently on problems that require dealing with examples with different ...
Transformer-based language models exhibit intelligent behaviors such as understanding natural language ...
Multi-head attention is a driving force behind state-of-the-art transformers, which achieve remarkable ...
Algorithmic generalization in machine learning refers to the ability to learn the underlying algorithm ...
When trained on language data, do transformers learn some arbitrary computation that utilizes the full ...
Pretrained transformer models have demonstrated remarkable performance across various natural language ...
This document aims to be a self-contained, mathematically precise overview of transformer architectures ...
Structure prediction (SP) tasks are important in natural language understanding in the sense that they ...
Several recent works demonstrate that transformers can implement algorithms like gradient descent. ...
We introduce Performers, Transformer architectures which can estimate regular (softmax) full-rank-attention ...
Humans can systematically generalize to novel compositions of existing concepts. Recent studies argue ...