The automatic evaluation of image descriptions is an intricate task, and it is highly important in the development and fine-grained analysis of captioning systems. Existing automatic metrics for image captioning systems fail to achieve a satisfactory level of correlation with human judgements at the sentence level. Moreover, unlike humans, these metrics tend to focus on specific aspects of quality, such as n-gram overlap or semantic meaning. In this paper, we present the first learning-based metric to evaluate image captions. Our proposed framework enables us to incorporate both lexical and semantic information into a single learned metric. This results in an evaluator that takes into account various linguistic features...
Two recent approaches have achieved state-of-the-art results in image captioning. The first uses a ...
Vision and language models are easily transferred to other tasks. In particular, they have been demo...
Each time we ask for an object, describe a scene, follow directions or read a document containi...
The evaluation of image caption quality is a challenging task, which requires the assessment of two ...
There is considerable interest in the task of automatically generating image captions. However, eval...
Generating a description of an image is called image captioning. Image captioning requires recognizi...
The automatic generation of image captions has received considerable attention. The problem of evalu...
Evaluation of generative models, in the visual domain, is often performed providing anecdotal result...
Paper presented at the Workshop on Vision and Language (VL’15), held in Lisbon (Portugal) on 18 ...
With the development of deep learning, the combination of computer vision and natural language proce...
Automatic image caption prediction is a challenging task in natural language processing. Most of the...