Human variation in content selection in summarization has given rise to some fundamental research questions: How can one incorporate the observed variation in suitable evaluation measures? How can such measures reflect the fact that summaries conveying different content can be equally good and informative? In this article, we address these very questions by proposing a method for analysis of multiple human abstracts into semantic content units. Such analysis allows us not only to quantify human variation in content selection, but also to assign empirical importance weight to different content units. It serves as the basis for an evaluation method, the Pyramid Method, that incorporates the observed variation and is predictive of different eq...
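The weighting idea in the abstract above can be illustrated with a minimal sketch. It assumes the semantic content units (SCUs) have already been identified by hand and are represented as plain string labels. The function names `scu_weights` and `pyramid_score` are hypothetical. The score shown is the ratio usually described for the original pyramid score (observed SCU weight over the best weight attainable with the same number of SCUs), which may differ from the exact variant used in the article.

```python
from collections import Counter


def scu_weights(model_annotations):
    """Weight each SCU by the number of model (human) summaries expressing it.

    model_annotations: iterable of collections of SCU labels, one per model
    summary. The labels are assumed to come from manual annotation.
    """
    weights = Counter()
    for scus in model_annotations:
        weights.update(set(scus))  # count each SCU at most once per summary
    return dict(weights)


def pyramid_score(peer_scus, weights):
    """Observed SCU weight divided by the best attainable weight.

    The denominator is the sum of the highest weights in the pyramid for a
    summary containing the same number of SCUs as the peer summary.
    SCUs that no model summary expressed contribute a weight of 0.
    """
    peer = set(peer_scus)
    observed = sum(weights.get(scu, 0) for scu in peer)
    best = sum(sorted(weights.values(), reverse=True)[:len(peer)])
    return observed / best if best else 0.0


# Hypothetical toy annotation: three model summaries, four SCUs.
models = [{"a", "b"}, {"a", "c"}, {"a", "b", "d"}]
weights = scu_weights(models)

# Two peer summaries with the same number of SCUs but different content.
score_ab = pyramid_score({"a", "b"}, weights)
score_ac = pyramid_score({"a", "c"}, weights)
```

In this toy case the SCU expressed by all three model summaries carries weight 3, so a peer summary that picks the two most widely shared SCUs scores higher than one that pairs the top SCU with a rarely expressed one. Two different peer summaries can also reach the same score when they select different SCUs of equal weight.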
A difficulty in the design of automated text summarization algorithms is in the objective evaluat...
Evaluating the selection of content in a summary is important both for human-written summaries, whic...
Annotation projects dealing with complex semantic or pragmatic phenomena face the dilemma of creatin...
We present an empirically grounded method for evaluating content selection in summarization. It inco...
From the outset of automated generation of summaries, the difficulty of evaluation has been widely d...
In DUC 2005, the pyramid method for content evaluation was used for the first time in a cross-site ev...
We present a fully automatic method for content selection evaluation in summarization that does not ...
The manual Pyramid method for summary evaluation, which focuses on the task of determining if a summ...
The pyramid evaluation effort for the 2006 Document Understanding Conference involved twenty-two sit...
A pyramid evaluation dataset was created for DUC 2003 in order to compare results with DUC 2005, and...