This paper addresses the problem of learning word im-age representations: given the cropped image of a word, we are interested in finding a descriptive, robust, and compact fixed-length representation. Machine learning techniques can then be supplied with these representations to produce models useful for word retrieval or recognition tasks. Al-though many works have focused on the machine learning aspect once a global representation has been produced, lit-tle work has been devoted to the construction of those base image representations: most works use standard coding and aggregation techniques directly on top of standard com-puter vision features such as SIFT or HOG. We propose to learn local mid-level features suitable for building word i...
AbstractWe introduce using images for word sense disambiguation, either alone, or in conjunction wit...
Abstract—Recognizing text in images taken in the wild is a challenging problem that has received gre...
Given an unstructured collection of captioned images of cluttered scenes featuring a variety of obje...
This paper addresses the problem of learning word im-age representations: given the cropped image of...
This paper addresses the problem of learning word im-age representations: given the cropped image of...
Abstract. We treat object recognition as a process of attaching words to images and image regions. T...
Visual Category Recognition aims at fast classification of objects, as well as scenery, action, and ...
Bag-of-Words lies at a heart of modern object category recognition systems. After descriptors are ex...
We describe a model of object recognition as machine translation. In this model, recognition is a pr...
In the recent years, the emergence of deep learning models has greatly advanced computer vision and ...
Labeling image collections is a tedious task, especially when multiple labels have to be chosen for ...
Jamieson M, Fazly A, Stevenson S, Dickinson S, Wachsmuth S. Using Language to Learn Structured Appea...
We present a new approach for modeling multi-modal data sets, focusing on the specific case of segme...
We present a novel approach to learning word embeddings by exploring subword information (character ...
In this paper we propose a novel computer vision method for classifying human facial expression from...
AbstractWe introduce using images for word sense disambiguation, either alone, or in conjunction wit...
Abstract—Recognizing text in images taken in the wild is a challenging problem that has received gre...
Given an unstructured collection of captioned images of cluttered scenes featuring a variety of obje...
This paper addresses the problem of learning word im-age representations: given the cropped image of...
This paper addresses the problem of learning word im-age representations: given the cropped image of...
Abstract. We treat object recognition as a process of attaching words to images and image regions. T...
Visual Category Recognition aims at fast classification of objects, as well as scenery, action, and ...
Bag-of-Words lies at a heart of modern object category recognition systems. After descriptors are ex...
We describe a model of object recognition as machine translation. In this model, recognition is a pr...
In the recent years, the emergence of deep learning models has greatly advanced computer vision and ...
Labeling image collections is a tedious task, especially when multiple labels have to be chosen for ...
Jamieson M, Fazly A, Stevenson S, Dickinson S, Wachsmuth S. Using Language to Learn Structured Appea...
We present a new approach for modeling multi-modal data sets, focusing on the specific case of segme...
We present a novel approach to learning word embeddings by exploring subword information (character ...
In this paper we propose a novel computer vision method for classifying human facial expression from...
AbstractWe introduce using images for word sense disambiguation, either alone, or in conjunction wit...
Abstract—Recognizing text in images taken in the wild is a challenging problem that has received gre...
Given an unstructured collection of captioned images of cluttered scenes featuring a variety of obje...