Evaluating the output of NLG systems is notoriously difficult, and performing assessments of text quality even more so. A range of automated and subject-based approaches to the evaluation of text quality have been taken, including comparison with a putative gold-standard text, analysis of specific linguistic features of the output, expert review, and task-based evaluation. In this paper we present the results of a variety of such approaches in the context of a case study application. We discuss the problems encountered in implementing each approach in the context of the literature, and propose that a test based on the Turing test for machine intelligence offers a way forward in the evaluation of the subjective notion of text quality.