Translation systems are generally trained to optimize BLEU, but many alternative metrics are available. We explore how optimizing toward various automatic evaluation metrics (BLEU, METEOR, NIST, TER) affects the resulting model. We train a state-of-the-art MT system using MERT on many parameterizations of each metric and evaluate the resulting models on the other metrics and also using human judges. In accordance with popular wisdom, we find that it’s important to train on the same metric used in testing. However, we also find that training to a newer metric is only useful to the extent that the MT model’s structure and features allow it to take advantage of the metric. Contrasting with TER’s good correlation with human judgments, we ...
This paper describes our submissions to the machine translation evaluation shared task in ACL WMT-08...
The process of developing hybrid MT systems is usually guided by an evaluation method used to compar...
State-of-the-art MT systems use a so-called log-linear model, which combines several components to pre...
Automatic Machine Translation (MT) evaluation metrics have traditionally been evaluated by the corre...
Evaluation of machine translation (MT) output is a challenging task. In most cases, there is no sing...
MT evaluation metrics are tested for correlation with human judgments either at the sentence- or the...
Minimum error rate training (MERT) involves choosing parameter values for a machine translation (MT...
We present the first ever results showing that tuning a machine translation system against a seman...
The main metric used for SMT system evaluation and optimisation is BLEU score ...
In Minimum Error Rate Training (MERT), the parameters of an SMT system are tuned on a certain evalua...
Machine Translation (MT) systems are more complex to test than they appear to be at first, since man...
We study the impact of source length and verbosity of the tuning dataset on the performance of para...