The fluency and creativity of large pre-trained language models (LLMs) have led to their widespread use, sometimes even as a replacement for traditional search engines. Yet language models are prone to making convincing but factually inaccurate claims, often referred to as 'hallucinations.' These errors can inadvertently spread misinformation or harmfully perpetuate misconceptions. Further, manual fact-checking of model responses is a time-consuming process, making human factuality labels expensive to acquire. In this work, we fine-tune language models to be more factual, without human labeling and targeting more open-ended generation settings than past work. We leverage two key recent innovations in NLP to do so. First, several recent works...
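A minimal sketch of the label-free recipe this abstract gestures at: sample several responses per prompt, score each with an automated factuality estimator, and convert score gaps into preference pairs suitable for preference-based fine-tuning. The estimator, margin rule, and pair format below are illustrative assumptions (the abstract is truncated before it names the paper's actual estimators), not the paper's exact method.

```python
# Sketch: turn automated factuality scores into preference pairs for
# preference-based fine-tuning, with no human labels involved.
from itertools import combinations
from typing import Callable

def build_preference_pairs(
    prompt: str,
    responses: list[str],
    score_fn: Callable[[str], float],  # automated factuality estimator (assumed)
    margin: float = 0.5,               # keep only confidently ordered pairs
) -> list[dict]:
    scored = [(r, score_fn(r)) for r in responses]
    pairs = []
    for (r1, s1), (r2, s2) in combinations(scored, 2):
        if abs(s1 - s2) >= margin:
            chosen, rejected = (r1, r2) if s1 > s2 else (r2, r1)
            pairs.append({"prompt": prompt, "chosen": chosen, "rejected": rejected})
    return pairs

# Placeholder scorer for demonstration only; a real system would plug in a
# reference-checking pipeline or a model-confidence signal here.
stub_score = lambda text: float(len(text.split()))
samples = ["Curie won two Nobel Prizes.", "Curie won one Nobel Prize in 1950."]
print(build_preference_pairs("Write a biography of Marie Curie.", samples, stub_score))
```

The resulting pairs have the prompt/chosen/rejected shape commonly consumed by preference-optimization trainers, which is why this intermediate format is a natural target for a label-free pipeline.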
Semantic consistency of a language model is broadly defined as the model's ability to produce semant...
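One rough way to operationalize the notion of semantic consistency described above: collect a model's answers to paraphrases of the same question and measure pairwise semantic agreement among the outputs. The sentence encoder, similarity threshold, and agreement metric below are illustrative assumptions, not the abstract's own procedure.

```python
# Sketch: score semantic consistency as the fraction of answer pairs whose
# sentence embeddings exceed a cosine-similarity threshold.
from itertools import combinations
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")

def consistency_score(answers: list[str], threshold: float = 0.8) -> float:
    """Fraction of answer pairs judged semantically equivalent."""
    embeddings = encoder.encode(answers, convert_to_tensor=True)
    pairs = list(combinations(range(len(answers)), 2))
    agree = sum(
        util.cos_sim(embeddings[i], embeddings[j]).item() >= threshold
        for i, j in pairs
    )
    return agree / len(pairs)

# Answers a model produced for three paraphrases of one question.
answers = [
    "The Eiffel Tower is in Paris.",
    "It is located in Paris, France.",
    "The tower stands in Berlin.",
]
print(consistency_score(answers))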
We call into question the recently popularized method of direct model editing as a means of correcti...
Grounded text generation systems often generate text that contains factual inconsistencies, hinderin...
Large Language Models (LLMs) make natural interfaces to factual knowledge, but their usefulness is l...
We propose a benchmark to measure whether a language model is truthful in generating answers to ques...
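For reference, this benchmark (TruthfulQA) is distributed on the Hugging Face Hub; the dataset id, config name, and field names below reflect the public "generation" configuration as I understand it and should be verified against the hub card.

```python
# Sketch: inspect a TruthfulQA generation-task example.
from datasets import load_dataset

ds = load_dataset("truthful_qa", "generation")["validation"]
example = ds[0]
print(example["question"])
print(example["best_answer"])
print(example["incorrect_answers"][:2])
```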
Large Language Models (LLMs) are increasingly used for accessing information on the web. Their truth...
Automatic fact-checking plays a crucial role in combating the spread of misinformation. Large Langua...
The increased deployment of LMs for real-world tasks involving knowledge and facts makes it importan...
Recent progress in pre-trained language models has led to systems that are able to generate text of an i...
Building on Petroni et al. (2019), we propose two new probing tasks analyzing factual knowledge stor...
Prior research has shown that typical fact-checking models for stand-alone claims struggle with clai...
The development of trustworthy conversational information-seeking systems relies on dialogue models ...
Large Language Models (LLMs), such as ChatGPT/GPT-4, have garnered widespread attention owing to the...
Language model fine-tuning is essential for modern natural language processing, but is computational...
We study whether language models can evaluate the validity of their own claims and predict which que...
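A minimal sketch of the self-evaluation idea in this abstract: ask the model whether its own answer is true and read off the probability mass it places on "True" versus "False" for the next token. The model, prompt template, and two-way renormalization are illustrative choices, not the paper's exact setup.

```python
# Sketch: P(True) self-evaluation with a small causal LM.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in; the underlying work studies much larger models
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

def p_true(question: str, proposed_answer: str) -> float:
    """Probability the model assigns to 'True' when judging its own answer."""
    prompt = (
        f"Question: {question}\n"
        f"Proposed Answer: {proposed_answer}\n"
        "Is the proposed answer True or False? Answer:"
    )
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]  # next-token distribution
    probs = torch.softmax(logits, dim=-1)
    true_id = tokenizer.encode(" True")[0]
    false_id = tokenizer.encode(" False")[0]
    # Renormalize over the two candidate continuations only.
    return (probs[true_id] / (probs[true_id] + probs[false_id])).item()

print(p_true("What is the capital of France?", "Paris"))
```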