A recent paper described a new machine translation evaluation metric, AMBER. This paper describes two changes to AMBER: the first is the incorporation of a new ordering penalty, and the second is the use of the downhill simplex algorithm to tune the weights of AMBER's components. We tested the impact of the two changes using data from the WMT metrics task. Each change by itself improved the performance of AMBER, and the two together yielded an even greater improvement, which in some cases was more than additive. The new version of AMBER clearly outperforms BLEU in terms of correlation with human judgment.
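As a rough illustration of the tuning step mentioned above, the following sketch uses the downhill simplex (Nelder-Mead) algorithm to choose component weights that maximize correlation with human judgments. The component scores, the judgments, and the weighted-sum form of the metric are illustrative assumptions, not AMBER's actual formulation.

# Hypothetical sketch of downhill-simplex weight tuning; the data and the
# weighted-sum metric are assumptions for illustration, not AMBER itself.
import numpy as np
from scipy.optimize import minimize

# Toy per-segment scores for three metric components and matching human judgments.
component_scores = np.array([
    [0.71, 0.45, 0.62],
    [0.55, 0.30, 0.48],
    [0.80, 0.60, 0.75],
    [0.40, 0.20, 0.35],
    [0.66, 0.50, 0.58],
])
human_judgments = np.array([4.1, 3.0, 4.6, 2.2, 3.8])

def neg_correlation(weights):
    # Negative Pearson correlation between the weighted metric and human
    # judgments; minimizing this maximizes the correlation.
    metric = component_scores @ weights
    return -np.corrcoef(metric, human_judgments)[0, 1]

# Start from uniform weights and let Nelder-Mead search without gradients.
result = minimize(neg_correlation, x0=np.ones(3) / 3, method="Nelder-Mead")
print("tuned weights:", result.x)
print("correlation:", -result.fun)

Nelder-Mead needs no gradient information, which makes it a natural fit when the metric's components are not differentiable in the weights.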
Meteor is an automatic metric for Machine Translation evaluation which has been demonstrated to hav...
Automatic Machine Translation (MT) evaluation is an active field of research, with a handful of new ...
Machine Translation (MT) systems are more complex to test than they appear to be at first, since man...
This paper proposes a new automatic machine translation evaluation metric: AMBER, which is based on ...
This paper describes our submissions to the machine translation evaluation shared task in ACL WMT-08...
Automatic Machine Translation (MT) evaluation metrics have traditionally been evaluated by the corre...
Evaluation of machine translation (MT) output is a challenging task. In most cases, there is no sing...
Automatic metrics are fundamental for the development and evaluation of machine translation systems....
Many machine translation (MT) evaluation metrics have been shown to correlate better with human judg...
Translation systems are generally trained to optimize BLEU, but many alternative metrics are availab...
State-of-the-art MT systems use a so-called log-linear model, which combines several components to pre...