We seek to understand how the representations of individual tokens and the structure of the learned feature space evolve between layers in deep neural networks under different learning objectives. We focus on the Transformer for our analysis, as it has been shown to be effective on a variety of tasks, including machine translation (MT), standard left-to-right language modeling (LM), and masked language modeling (MLM). Previous work used black-box probing tasks to show that the representations learned by the Transformer differ significantly depending on the objective. In this work, we use canonical correlation analysis and mutual information estimators to study how information flows across Transformer layers, and we observe that the choice of the objective determines how this process unfolds.
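To make the analytical tool concrete, below is a minimal sketch of a CCA-based similarity score between two layers' token representations. It assumes plain NumPy and toy random activations; studies of this kind typically rely on refinements such as projection-weighted CCA, so treat this as an illustration of the core idea rather than the estimator used in the paper.

import numpy as np

def cca_similarity(X, Y, eps=1e-10):
    """Mean canonical correlation between two views of the same tokens.

    X, Y: arrays of shape (n_tokens, dim), activations of the same
    tokens taken at two different layers (or from two models).
    """
    # Center each view.
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)

    # Orthonormalize each view via SVD, dropping near-zero directions.
    def whiten(A):
        U, s, _ = np.linalg.svd(A, full_matrices=False)
        return U[:, s > eps]

    Ux, Uy = whiten(X), whiten(Y)

    # Singular values of Ux^T Uy are the canonical correlations.
    rho = np.linalg.svd(Ux.T @ Uy, compute_uv=False)
    return rho.mean()

# Toy usage: compare two hypothetical "layers" over 1000 tokens.
rng = np.random.default_rng(0)
h_layer1 = rng.normal(size=(1000, 64))
h_layer2 = h_layer1 @ rng.normal(size=(64, 64))  # linear map of layer 1
print(cca_similarity(h_layer1, h_layer2))  # near 1.0: same subspace

Because canonical correlations are invariant to invertible linear maps of either view, the score stays near 1.0 for the linearly transformed layer above; applied to real activations, drops in this score across layers indicate where the feature space is reorganized.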