Barriers and enabling factors for error analysis in NLG research

Semantic Accuracy in Natural Language Generation: A Thesis Proposal

Schmidtová, Patrícia

January 2023

With the fast-growing popularity of current large pre-trained language models (LLMs), it is necessar...

Comparing automatic and human evaluation of NLG systems

Anja Belz

January 2006

We consider the evaluation problem in Natural Language Generation (NLG) and present results for eval...

Twenty years of confusion in human evaluation: NLG needs evaluation sheets and standardised definitions

Howcroft, David
Belz, Anya
Gkatzia, Dimitra
Clinciu, Miruna
Hasan, Sadid
Mahamood, Saad
Mille, Simon
van Miltenburg, Emiel
Santhanam, Sashank
Rieser, Verena

December 2020

Human assessment remains the most trusted form of evaluation in NLG, but highly diverse approaches a...

Barriers and enabling factors for error analysis in NLG research

Van Miltenburg, Emiel
Clinciu, Miruna
Dušek, Ondřej
Gkatzia, Dimitra
Inglis, Stephanie
Leppänen, Leo
Mahamood, Saad
Schoch, Stephanie
Thomson, Craig
Wen, Luou

January 2023

Earlier research has shown that few studies in Natural Language Generation (NLG) evaluate their syst...

Underreporting of errors in NLG output, and what to do about it

van Miltenburg, Emiel
Clinciu, Miruna
Dušek, Ondřej
Gkatzia, Dimitra
Inglis, Stephanie
Leppänen, Leo
Mahamood, Saad
Manning, Emma
Schoch, Stephanie
Thomson, Craig
Wen, Luou

August 2021

We observe a severe under-reporting of the different kinds of errors that Natural Language Generatio...

A survey of recent error annotation schemes for automatically generated text

Huidrom, Rudali
Belz, Anya

December 2022

While automatically computing numerical scores remains the dominant paradigm in NLP system evaluatio...

Problems in Evaluating Grammatical Error Detection Systems

M Ar T In
C Hodorow
M Ar Kus Di C K I N Son
Ross Israe L
J Oel T E T
Reau Lt

April 2020

ABSTRACT Many evaluation issues for grammatical error detection have previously been overlooked, mak...

Acquiring correct knowledge for natural language generation

Ehud Reiter
Somayajulu G. Sripada
Roma Robertson

January 2003

Natural language generation (nlg) systems are computer software systems that pro-duce texts in Engli...

Evaluating the Evaluators: Subjective Bias and Consistency in Human Evaluation of Natural Language Generation

Amidei, Jacopo

January 2021

The Natural Language Generation (NLG) community relies on shared evaluation techniques to understand...

The use of rating and Likert scales in Natural Language Generation human evaluation tasks: A review and some recommendations

Amidei, Jacopo
Piwek, Paul
Willis, Alistair

January 2019

Rating and Likert scales are widely used in evaluation experiments to measure the quality of Natural...

Five sources of bias in natural language processing

Hovy, Dirk
Prabhumoye, Shrimai

January 2021

Recently, there has been an increased interest in demographically grounded bias in natural language ...

Reading errors made by skilled and unskilled readers: evaluating a system that generates reports for people with poor literacy

Williams, Sandra
Reiter, Ehud

August 2004

This study evaluates a natural language generation system that creates literacy assessment reports i...

Undesirable biases in NLP: Averting a crisis of measurement

van der Wal, Oskar
Bachmann, Dominik
Leidinger, Alina
van Maanen, Leendert
Zuidema, Willem
Schulz, Katrin

November 2022

As Natural Language Processing (NLP) technology rapidly develops and spreads into daily life, it bec...

Automatic Error Detection for Natural Language Generation

Michael Wayne
Goodmany Francis Bondz

December 2014

One of the advantages of deep grammars, such as those based on HPSG, is that they can be used for ge...

Two reproductions of a human-assessed comparative evaluation of a semantic error detection system

Huidrom, Rudali
Dušek, Ondřej
Kasner, Zdeněk
Castro Ferrera, Thiago
Belz, Anya

July 2022

In this paper, we present the results of two re- production studies for the human evaluation origina...

Semantic Accuracy in Natural Language Generation: A Thesis Proposal

Schmidtová, Patrícia

January 2023

With the fast-growing popularity of current large pre-trained language models (LLMs), it is necessar...

Comparing automatic and human evaluation of NLG systems

Anja Belz

January 2006

We consider the evaluation problem in Natural Language Generation (NLG) and present results for eval...

Twenty years of confusion in human evaluation: NLG needs evaluation sheets and standardised definitions

Howcroft, David
Belz, Anya
Gkatzia, Dimitra
Clinciu, Miruna
Hasan, Sadid
Mahamood, Saad
Mille, Simon
van Miltenburg, Emiel
Santhanam, Sashank
Rieser, Verena

December 2020

Human assessment remains the most trusted form of evaluation in NLG, but highly diverse approaches a...

Barriers and enabling factors for error analysis in NLG research

Van Miltenburg, Emiel
Clinciu, Miruna
Dušek, Ondřej
Gkatzia, Dimitra
Inglis, Stephanie
Leppänen, Leo
Mahamood, Saad
Schoch, Stephanie
Thomson, Craig
Wen, Luou

January 2023

Earlier research has shown that few studies in Natural Language Generation (NLG) evaluate their syst...

Underreporting of errors in NLG output, and what to do about it

van Miltenburg, Emiel
Clinciu, Miruna
Dušek, Ondřej
Gkatzia, Dimitra
Inglis, Stephanie
Leppänen, Leo
Mahamood, Saad
Manning, Emma
Schoch, Stephanie
Thomson, Craig
Wen, Luou

August 2021

We observe a severe under-reporting of the different kinds of errors that Natural Language Generatio...

A survey of recent error annotation schemes for automatically generated text

Huidrom, Rudali
Belz, Anya

December 2022

While automatically computing numerical scores remains the dominant paradigm in NLP system evaluatio...

Problems in Evaluating Grammatical Error Detection Systems

M Ar T In
C Hodorow
M Ar Kus Di C K I N Son
Ross Israe L
J Oel T E T
Reau Lt

April 2020

ABSTRACT Many evaluation issues for grammatical error detection have previously been overlooked, mak...

Acquiring correct knowledge for natural language generation

Ehud Reiter
Somayajulu G. Sripada
Roma Robertson

January 2003

Natural language generation (nlg) systems are computer software systems that pro-duce texts in Engli...

Evaluating the Evaluators: Subjective Bias and Consistency in Human Evaluation of Natural Language Generation

Amidei, Jacopo

January 2021

The Natural Language Generation (NLG) community relies on shared evaluation techniques to understand...

The use of rating and Likert scales in Natural Language Generation human evaluation tasks: A review and some recommendations

Amidei, Jacopo
Piwek, Paul
Willis, Alistair

January 2019

Rating and Likert scales are widely used in evaluation experiments to measure the quality of Natural...

Five sources of bias in natural language processing

Hovy, Dirk
Prabhumoye, Shrimai

January 2021

Recently, there has been an increased interest in demographically grounded bias in natural language ...

Reading errors made by skilled and unskilled readers: evaluating a system that generates reports for people with poor literacy

Williams, Sandra
Reiter, Ehud

August 2004

This study evaluates a natural language generation system that creates literacy assessment reports i...

Undesirable biases in NLP: Averting a crisis of measurement

van der Wal, Oskar
Bachmann, Dominik
Leidinger, Alina
van Maanen, Leendert
Zuidema, Willem
Schulz, Katrin

November 2022

As Natural Language Processing (NLP) technology rapidly develops and spreads into daily life, it bec...

Automatic Error Detection for Natural Language Generation

Michael Wayne
Goodmany Francis Bondz

December 2014

One of the advantages of deep grammars, such as those based on HPSG, is that they can be used for ge...

Two reproductions of a human-assessed comparative evaluation of a semantic error detection system

Huidrom, Rudali
Dušek, Ondřej
Kasner, Zdeněk
Castro Ferrera, Thiago
Belz, Anya

July 2022

In this paper, we present the results of two re- production studies for the human evaluation origina...

Semantic Accuracy in Natural Language Generation: A Thesis Proposal

Schmidtová, Patrícia

January 2023

With the fast-growing popularity of current large pre-trained language models (LLMs), it is necessar...

Comparing automatic and human evaluation of NLG systems

Anja Belz

January 2006

We consider the evaluation problem in Natural Language Generation (NLG) and present results for eval...

Twenty years of confusion in human evaluation: NLG needs evaluation sheets and standardised definitions

Howcroft, David
Belz, Anya
Gkatzia, Dimitra
Clinciu, Miruna
Hasan, Sadid
Mahamood, Saad
Mille, Simon
van Miltenburg, Emiel
Santhanam, Sashank
Rieser, Verena

December 2020

Human assessment remains the most trusted form of evaluation in NLG, but highly diverse approaches a...

Barriers and enabling factors for error analysis in NLG research

Abstract

Extracted data

Barriers and enabling factors for error analysis in NLG research

Abstract

Extracted data

Related items

Related items