We present the MT-NCD and MT-mNCD machine translation evaluation metrics as a submission to the machine translation evaluation shared task (MetricsMATR 2010). The metrics are based on normalized compression distance (NCD), a general information-theoretic measure of string similarity, and are evaluated against human judgments from the WMT08 shared task. The experiments show that 1) our metric improves correlation with human judgments by using flexible matching, 2) segment replication is effective, and 3) our NCD-inspired method for multiple references indicates improved results. Generally, the proposed MT-NCD and MT-mNCD methods correlate competitively with human judgments compared to commonly used machine translation evaluation metrics, fo...
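To make the underlying measure concrete, here is a minimal sketch of plain normalized compression distance between a candidate translation and a reference, using the standard definition NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed length. The choice of zlib as the compressor and the function names `compressed_size` and `ncd` are assumptions made for illustration; the abstract does not say which compressor the authors used, and this sketch does not cover the flexible matching, segment replication, or multiple-reference handling mentioned above.

```python
import zlib


def compressed_size(text: str) -> int:
    """Length in bytes of the zlib-compressed UTF-8 encoding of text.

    zlib is an assumed stand-in compressor, not necessarily the one
    used for MT-NCD.
    """
    return len(zlib.compress(text.encode("utf-8"), 9))


def ncd(x: str, y: str) -> float:
    """Normalized compression distance between two strings.

    Lower values mean the strings share more compressible structure,
    i.e. they are more similar.
    """
    cx = compressed_size(x)
    cy = compressed_size(y)
    cxy = compressed_size(x + y)  # compress the concatenation
    return (cxy - min(cx, cy)) / max(cx, cy)
```

Under these assumptions, a candidate that closely matches the reference should receive a lower distance than an unrelated sentence. Scores on very short segments are dominated by compressor overhead, so they are noisy.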
This paper presents the results of the WMT17 Metrics Shared Task. We asked participants of this task...
We propose three new features for MT evaluation: source-sentence constrained n-gram precision, sourc...
This paper aims at providing a reliable method for measuring the correlations between different scor...
Traditional machine translation evaluation metrics such as BLEU and WER have been widely used, but t...
State-of-the-art MT systems use a so-called log-linear model, which combines several components to pre...
Automatic Machine Translation (MT) evaluation is an active field of research, with a handful of new ...
The success of the Transformer architecture has led to increased interest in machine translation (MT). The...
The development of machine translation systems depends on the evaluation of their results. However,...
Automatic Machine Translation (MT) evaluation metrics have traditionally been evaluated by the corre...
As described in this paper, we propose a new automatic evaluation metric for machine translation. ...
This paper describes our submissions to the machine translation evaluation shared task in ACL WMT-08...
Automatic Machine Translation metrics, such as BLEU, are widely used in empirical evaluation as a su...