Modern large-scale Pre-trained Language Models (PLMs) have achieved tremendous success on a wide range of downstream tasks. However, most LM pre-training objectives focus solely on text reconstruction and do not seek to learn latent-level, interpretable representations of sentences. In this paper, we push language models toward a deeper understanding of sentences by proposing a new pre-training objective, Sparse Latent Typing, which enables the model to sparsely extract sentence-level keywords with diverse latent types. Experimental results show that our model learns interpretable latent type categories in a self-supervised manner without using any external knowledge. Besides, the language model pre-train...
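The abstract does not spell out the architecture, but an objective of this kind is commonly realized as a small typing head over token representations with a discrete, differentiable bottleneck and a sparsity penalty added to the usual reconstruction loss. The sketch below is only a minimal illustration under those assumptions, not the paper's actual implementation: the module name LatentTyper, the reserved "no type" slot at index 0, and the 0.1 loss weight are all hypothetical placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LatentTyper(nn.Module):
    """Illustrative head that assigns each token a discrete latent type.

    Type index 0 is treated as a "no type" slot; a sparsity penalty pushes
    most tokens there so that only a few keywords receive real types.
    (Hypothetical sketch, not the method from the cited paper.)
    """

    def __init__(self, hidden_size: int, num_types: int = 16, tau: float = 1.0):
        super().__init__()
        self.type_logits = nn.Linear(hidden_size, num_types)
        self.tau = tau

    def forward(self, hidden_states: torch.Tensor):
        # hidden_states: (batch, seq_len, hidden_size) from a PLM encoder
        logits = self.type_logits(hidden_states)
        # Differentiable discrete type assignment via straight-through Gumbel-softmax
        types_onehot = F.gumbel_softmax(logits, tau=self.tau, hard=True)
        # Probability mass assigned to "real" types (everything except index 0)
        typed_prob = 1.0 - F.softmax(logits, dim=-1)[..., 0]
        sparsity_loss = typed_prob.mean()  # encourage most tokens to stay untyped
        return types_onehot, sparsity_loss


if __name__ == "__main__":
    torch.manual_seed(0)
    encoder_out = torch.randn(2, 10, 768)   # stand-in for PLM hidden states
    typer = LatentTyper(hidden_size=768)
    types, sparsity = typer(encoder_out)
    recon_loss = torch.tensor(2.3)          # placeholder for the usual LM reconstruction loss
    total_loss = recon_loss + 0.1 * sparsity
    print(types.shape, float(total_loss))
```

In this reading, the reconstruction term keeps the representations faithful to the input text while the sparsity term forces the typing head to commit to a handful of keyword tokens, which is one plausible way to obtain the "sparse, interpretable latent types" the abstract describes.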
State-of-the-art pre-trained language models have been shown to memorise facts and perform well wi...
Gigantic pre-trained models have become central to natural language processing (NLP), serving as the...
Structuring the latent space in probabilistic deep generative models, e.g., va...
The pre-training and fine-tuning paradigm has contributed to a number of breakthroughs in Natural La...
Pre-trained Language Models (PLMs) have achieved great success in various Natural Language Processin...
Pretrained language models (PLMs) have demonstrated remarkable performance in various natural langua...
Existing pre-trained models are generally geared towards a particular class of problems. To date, th...
Some NLP tasks can be solved in a fully unsupervised fashion by providing a pretrained language mode...
Recently, the development of pre-trained language models has brought natural language processing (NL...
To obtain high-quality sentence embeddings from pretrained language models (PLMs), they must either ...
Natural language processing (NLP) techniques have significantly improved with the introduction of pre-trained l...
Pretraining deep neural networks to perform language modeling - that is, to reconstruct missing word...
With the dramatically increased number of parameters in language models, sparsity methods have recei...
Large Language Models have become the core architecture upon which most modern natural language proc...
In recent years, the community of natural language processing (NLP) has seen amazing progress in the...