Self-supervised large language models (LMs) have become a highly influential and foundational tool for many NLP systems, which makes their expressivity an important topic of study. In near-universal practice, given a language context, the model predicts a word from the vocabulary by comparing a single embedded vector representation of the context against the embeddings of the dictionary entries. However, a context sometimes implies that the distribution over predicted words should be multi-modal in the embedding space, and a single-vector context representation provably cannot capture such a distribution. To address this limitation, we propose to represent a context with multiple vector embeddings, which we term facets. This is distinct from previous work ...
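The contrast the abstract draws can be illustrated with a small numeric sketch. The names and the combination rule below are assumptions for illustration, not the paper's method: we contrast the standard single-vector softmax with one plausible multi-facet scheme, a weighted mixture of per-facet softmax distributions, which can place probability mass on several well-separated word clusters where a single dot-product softmax cannot.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def single_facet_probs(context_vec, vocab_emb):
    # Standard practice: one context vector, dot-product logits over the vocabulary.
    return softmax(vocab_emb @ context_vec)

def multi_facet_probs(facet_vecs, facet_weights, vocab_emb):
    # Hypothetical multi-facet combination: a weighted mixture of the
    # per-facet softmax distributions. Each facet can point at a different
    # mode (word cluster) in embedding space.
    per_facet = np.stack([softmax(vocab_emb @ f) for f in facet_vecs])  # (K, V)
    return facet_weights @ per_facet                                    # (V,)

rng = np.random.default_rng(0)
V, d, K = 10, 4, 3                        # vocab size, embedding dim, number of facets
vocab_emb = rng.normal(size=(V, d))
facets = rng.normal(size=(K, d))
weights = softmax(rng.normal(size=K))     # mixture weights, sum to 1
p = multi_facet_probs(facets, weights, vocab_emb)
print(p.sum())  # ≈ 1.0: still a valid distribution over the vocabulary
```

Because the mixture output is a convex combination of softmax distributions, it remains a proper distribution while allowing multiple peaks; a single context vector constrains the log-probabilities to be a linear function of the word embeddings, which rules out certain multi-modal shapes.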