In multi-document summarization (MDS), the input is a cluster of documents, and the output is the cluster summary. In this paper, we focus on pretraining objectives for MDS. Specifically, we introduce a simple pretraining objective of choosing the ROUGE-based centroid of each document cluster as a proxy for its summary. Our objective thus does not require human-written summaries and can be used for pretraining on a dataset containing only clusters of documents. Through zero-shot and fully supervised experiments on multiple MDS datasets, we show that our model, Centrum, is better than or comparable to a state-of-the-art model. We release our pretrained and finetuned models at https://github.com/ratishsp/centrum.

Comment: 4 pages, work-in-progress
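The centroid-selection objective described above can be sketched as follows. This is a simplified illustration, not the paper's implementation: it uses a toy unigram-overlap F1 in place of full ROUGE, and the function names (`rouge1_f1`, `centroid_document`) are hypothetical.

```python
# Sketch of a Centrum-style pretraining target: pick the document whose
# average ROUGE-like overlap with the other documents in the cluster is
# highest, and treat it as the proxy summary for that cluster.
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 between two texts (a toy stand-in for ROUGE-1)."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def centroid_document(cluster: list[str]) -> str:
    """Return the document with the highest mean overlap against the rest."""
    def mean_score(i: int) -> float:
        others = [d for j, d in enumerate(cluster) if j != i]
        return sum(rouge1_f1(cluster[i], o) for o in others) / len(others)

    best = max(range(len(cluster)), key=mean_score)
    return cluster[best]
```

In practice one would substitute a proper ROUGE implementation for `rouge1_f1` and run this selection once over each cluster to build (cluster, proxy-summary) pretraining pairs.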
Automatic summarization has advanced greatly in the past few decades. However, there remains a huge ...
Obtaining large-scale and high-quality training data for multi-document summarization (MDS) tasks is...
We present a multi-document summarizer, MEAD, which generates summaries using cluster centroids prod...
In Natural Language Processing, multi-document summarization (MDS) poses many challenges to research...
Text clustering methods were traditionally incorporated into multi-document summarization (MDS) as a...
In this thesis, we have approached a technique for tackling abstractive text summarization tasks wi...
In this paper, we explore the use of automatic syntactic simplification for improving content select...
Automatic multi-document summarization (MDS) is the process of extracting the most important informa...
Text summarization aims to create a concise and fluent summary that captures the most salient inform...
Summarization is the notion of abstracting key content from information sources. The task of summari...