Transformer language models (LMs) have been shown to represent concepts as directions in the latent space of hidden activations. However, for any given human-interpretable concept, how can we find its direction in the latent space? We present a technique called linear relational concepts (LRC) for finding concept directions corresponding to human-interpretable concepts at a given hidden layer in a transformer LM by first modeling the relation between subject and object as a linear relational embedding (LRE). While the LRE work was mainly presented as an exercise in understanding model representations, we find that inverting the LRE while using earlier object layers results in a powerful technique for finding concept directions that both work well as classifiers and can be used to influence model output.
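
To make the inversion step concrete, the following is a minimal sketch of how a concept direction could be recovered from a fitted LRE. It assumes the LRE takes the affine form o ≈ Ws + b between a subject hidden state s and a later object hidden state o, and that W is inverted with a rank-truncated pseudo-inverse; the function name, the rank hyperparameter, and the PyTorch framing are illustrative assumptions, not the paper's reference implementation.

    import torch

    def concept_direction_from_lre(
        W: torch.Tensor,    # fitted LRE weight, shape (d_obj, d_subj)
        b: torch.Tensor,    # fitted LRE bias, shape (d_obj,)
        o_c: torch.Tensor,  # object-layer activation for the concept value, shape (d_obj,)
        rank: int = 100,    # low-rank cutoff for the pseudo-inverse (assumed hyperparameter)
    ) -> torch.Tensor:
        # SVD of the relation operator: W = U diag(S) V^T
        U, S, Vh = torch.linalg.svd(W, full_matrices=False)
        # Rank-r pseudo-inverse: W_r^+ = V_r diag(1/S_r) U_r^T
        W_pinv = Vh[:rank].T @ torch.diag(1.0 / S[:rank]) @ U[:, :rank].T
        # Map the object embedding back into subject space and normalize
        # to obtain a unit-length direction at the subject layer.
        s_c = W_pinv @ (o_c - b)
        return s_c / s_c.norm()

In this sketch, W and b would be estimated from the LM (e.g., by fitting the mapping between subject and object hidden states over examples of the relation), and o_c could be an object activation averaged over prompts expressing the concept; the resulting unit vector s_c then serves as the concept direction at the subject layer.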