Teaching new information to pre-trained large language models (PLMs) is a crucial but challenging task. Model adaptation techniques, such as fine-tuning and parameter-efficient training, are often prone to catastrophic forgetting, and most existing benchmarks focus on task adaptation rather than on acquiring new information. This work studies and quantifies how PLMs may learn and remember new world knowledge facts that do not occur in their pre-training corpus, which only contains world knowledge up to a certain date. To that end, we first propose NOVEL-WD, a new dataset consisting of sentences containing novel facts extracted from recent Wikidata updates, along with two evaluation tasks in the form of causal language modeling and multiple c...
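The causal language modeling evaluation mentioned above can be pictured as checking how probable a model finds a sentence that states the new fact. The snippet below is a minimal sketch of such a check, assuming a HuggingFace transformers causal LM; the model checkpoint and the example sentence are illustrative placeholders, not material from NOVEL-WD.

```python
# Sketch only: perplexity of a causal LM on one fact sentence.
# Model name and sentence are placeholders for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; any causal LM checkpoint works
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

sentence = "Example sentence stating a novel world-knowledge fact."  # placeholder

inputs = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    # Passing labels=input_ids returns the mean token-level cross-entropy loss.
    outputs = model(**inputs, labels=inputs["input_ids"])

perplexity = torch.exp(outputs.loss).item()
print(f"Perplexity on the fact sentence: {perplexity:.2f}")
```

A lower perplexity before vs. after adaptation would indicate that the model has absorbed the stated fact; the same scoring loop can be run over a whole set of fact sentences to obtain an aggregate measure.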
Humans possess a remarkable ability to assign novel interpretations to linguistic expressions, enabl...
Soft prompts have been recently proposed as a tool for adapting large frozen language models (LMs) t...
Recent advances in large-scale pre-training provide large models with the potential to learn knowled...
Combining structured information with language models is a standing problem in NLP. Building on prev...
Large Language Models (LMs) are known to encode world knowledge in their parameters as they pretrain...
Despite advances in deep learning and knowledge graphs (KGs), using language models for natural lang...
Neural language models have drastically changed the landscape of natural language processing (NLP). ...
The performance of a machine learning model trained on labeled data of a (source) domain degrades se...
Large language models can store extensive world knowledge, often extractable through question-answer...
Representational spaces learned via language modeling are fundamental to Natural Language Processing...
Large language models (LLMs) have shown incredible performance in completing various real-world task...
Knowledge-Enhanced Pre-trained Language Models (KEPLMs) are pre-trained models with relation triples...
Thesis (Ph.D.)--University of Washington, 2022. A robust language processing machine should be able to...
Pre-trained models learn informative representations on large-scale training data through a self-sup...
Despite their wide adoption, the underlying training and memorization dynamics of very large languag...