Abstract Recent advances in large language models (LLMs) have demonstrated remarkable successes in zero- and few-shot performance on various downstream tasks, paving the way for applications in high-stakes domains. In this study, we systematically examine the capabilities and limitations of LLMs, specifically GPT-3.5 and ChatGPT, in performing zero-shot medical evidence summarization across six clinical domains. We conduct both automatic and human evaluations, covering several dimensions of summary quality. Our study demonstrates that automatic metrics often do not strongly correlate with the quality of summaries. Furthermore, informed by our human evaluations, we define a terminology of error types for medical evidence summarization. Our f...
We present LQVSumm, a corpus of about 2000 automatically created extractive multi-document summaries...
Abstract Objective: The amount of information for clinicians and clinical researchers is growing expone...
We consider the problem of learning to simplify medical texts. This is important because most reliab...
Large language models (LLMs) have been applied to tasks in healthcare, ranging from medical exam que...
We show that large language models, such as GPT-3, perform well at zero-shot information extraction ...
The recent focus on Large Language Models (LLMs) has yielded unprecedented discussion of their poten...
The goal of automated summarization techniques (Paice, 1990; Kupiec et al., 1995) is to condense text...
Large Language Models (LLMs) like the GPT and LLaMA families have demonstrated exceptional capabilit...
We present an empirical evaluation of various outputs generated by nine of the most widely-available...
Abstract There is an increasing interest in developing artificial intelligence (AI) systems to proce...
Artificial intelligence (AI)-based language models, such as ChatGPT, offer an enormous potential for ...
Text summarization is a critical Natural Language Processing (NLP) task with applications ranging fr...
Large language models (LLMs)—machine learning algorithms that can recognize, summarize, translate,...
While large language models (LLMs) already achieve strong performance on standard generic summarizat...
Topic models help make sense of large text collections. Automatically evaluating their output and de...