Abstract. We treat object recognition as a process of attaching words to images and image regions. To accomplish this we exploit clustering methods which learn the joint statistics of words and image regions. We show how these models can then be used to attach words to images outside the training set. This “auto-annotation” process has applications such as image indexing, as well as being related to object recognition. Predicted words can be compared to actual words associated with images in a held-out set, and we introduce several performance measures based on this observation. These measures are then used to make principled comparisons of model variants and proposed enhancements. Word prediction is most simply done as a function of the ...
Recent works in object recognition often use visual words, i.e. vector quantized local descriptors e...
We present a new and very rich approach for modeling multi-modal data sets, focusing on the specific ...
We introduce using images for word sense disambiguation, either alone, or in conjunction wit...
We describe a model of object recognition as machine translation. In this model, recognition is a pr...
We present a new approach for modeling multi-modal data sets, focusing on the specific case of segme...
We approach the object recognition problem as the process of attaching meaningful labels to specific...
Feature selection is very important for many computer vision applications. However, it is hard to fi...
During the last decade, machine learning techniques have been used successfully in many applications...
The problem of learning language models from large text corpora has been widely studied within the ...
We present on-going work on the topic of learning translation models between image data and text (E...
Object recognition in images is a popular research field with many applications including medicine, ...
This paper addresses the problem of learning word image representations: given the cropped image of...
We present a novel method for constructing a visual vocabulary that takes into account the class lab...
Given an unstructured collection of captioned images of cluttered scenes featuring a variety of obje...