Prior work on controllable text generation has focused on learning how to control language models through trainable decoding, smart-prompt design, or fine-tuning based on a desired objective. We hypothesize that the information needed to steer the model to generate a target sentence is already encoded within the model. Accordingly, we explore a different approach altogether: extracting latent vectors directly from pretrained language model decoders without fine-tuning. Experiments show that there exist steering vectors, which, when added to the hidden states of the language model, generate a target sentence nearly perfectly (> 99 BLEU) for English sentences from a variety of domains. We show that vector arithmetic can be used for unsupervised...
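The mechanism this abstract describes, adding a learned vector to a frozen language model's hidden states so that decoding reproduces a target sentence, can be illustrated in a few lines of PyTorch. The sketch below is not the paper's implementation: the model choice (gpt2), the injection layer, the optimizer settings, and the step count are all illustrative assumptions.

```python
# Minimal sketch of latent steering-vector extraction: optimize a single
# vector z so a frozen GPT-2, with z added to one layer's hidden states,
# reconstructs a target sentence. Hyperparameters here are assumptions.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()
for p in model.parameters():
    p.requires_grad_(False)  # the LM stays frozen; only z is trained

target = "The quick brown fox jumps over the lazy dog."
ids = tokenizer(target, return_tensors="pt").input_ids

layer = 6  # hypothetical mid-stack injection point
z = torch.zeros(model.config.n_embd, requires_grad=True)

def add_steering(module, inputs, output):
    # Add the steering vector to every position's hidden state; returning
    # a value from a forward hook replaces the module's output.
    return (output[0] + z,) + output[1:]

hook = model.transformer.h[layer].register_forward_hook(add_steering)
opt = torch.optim.Adam([z], lr=0.01)

for step in range(500):
    opt.zero_grad()
    out = model(ids, labels=ids)  # teacher-forced reconstruction loss
    out.loss.backward()
    opt.step()

# With the hook still active, greedy decoding from the first token should,
# after enough optimization steps, steer generation toward the target.
with torch.no_grad():
    gen = model.generate(ids[:, :1], max_length=ids.shape[1],
                         do_sample=False,
                         pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(gen[0]))
hook.remove()
```

Because only z is optimized, each target sentence yields its own steering vector in the model's latent space, which is what makes the vector arithmetic and similarity comparisons mentioned in the abstract possible.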
Text Generation aims to produce plausible and readable text in a human language from input data. The...
Obtaining meaning-rich representations of social media inputs, such as Tweets (unstructured and nois...
Recent studies show that auto-encoder based approaches successfully perform language generation, smo...
In recent years, the field of language modelling has witnessed exciting developments. Especially, th...
Large-scale neural language models have made impressive strides in natural language generation. Howe...
Reliably controlling the behavior of large language models is a pressing open problem. Existing meth...
Recent works have shown promising results of prompt tuning in stimulating pre-trained language model...
Large language models (LMs) based on Transformers can generate plausible l...
The Hidden Vector State (HVS) Model is an extension of the basic discrete Markov model in which cont...
Pretrained language models are expected to effectively map input text to a set of vectors while pres...
We describe an approach for unsupervised learning of a generic, distributed sentence encoder. Using...
Large pre-trained language models (LMs) based on Transformers can generate...
Recently, discrete latent variable models have received a surge of interest in both Natural Language...
Large pre-trained language models have repeatedly shown their ability to produce fluent text. Yet ev...
Thesis (Ph.D.), University of Washington, 2023. Language models (LMs) are at the core of almost all st...