Labeling image collections is a tedious task, especially when multiple labels have to be chosen for each image. In this paper we introduce a new framework that extends state of the art models in word prediction to incorporate information from unlabeled examples, using manifold regularization. To the best of our knowledge this is the first semi-supervised multi-task model used in vision problems. The new model can be solved using gradient descent and is fast and efficient. We show remarkable improvements for cases with few labeled examples for challenging multi-task learning problems in vision (predicting words for images and attributes for objects)
Scene recognition has been widely studied to understand visual information from the level of objects...
This paper addresses the problem of learning word im-age representations: given the cropped image of...
This paper addresses the problem of learning word im-age representations: given the cropped image of...
In many real world applications we do not have access to fully-labeled training data, but only to a ...
In many real world applications we do not have access to fully-labeled training data, but only to a ...
It is a significant challenge to classify images with multiple labels by using only a small number o...
One of the challenges in image search is to learn with few labeled examples. Existing solutions main...
International audienceWe propose structured models for image labeling that take into account the dep...
In many real-world applications of supervised learning, only a limited number of labeled examples ar...
We present a new paradigm for fine-tuning large-scale vision-language pre-trained models on downstre...
Search-based structured prediction methods have shown promising successes in both computer vision an...
By utilizing the label dependencies among both the labeled and unlabeled data, semi-supervised learn...
Visual recognition systems are often limited to the object categories previously trained on and thus...
Visual attributes expose human-defined semantics to object recognition models, but existing work lar...
In multi-label learning, each training example is represented by a single instance (feature vector) ...
Scene recognition has been widely studied to understand visual information from the level of objects...
This paper addresses the problem of learning word im-age representations: given the cropped image of...
This paper addresses the problem of learning word im-age representations: given the cropped image of...
In many real world applications we do not have access to fully-labeled training data, but only to a ...
In many real world applications we do not have access to fully-labeled training data, but only to a ...
It is a significant challenge to classify images with multiple labels by using only a small number o...
One of the challenges in image search is to learn with few labeled examples. Existing solutions main...
International audienceWe propose structured models for image labeling that take into account the dep...
In many real-world applications of supervised learning, only a limited number of labeled examples ar...
We present a new paradigm for fine-tuning large-scale vision-language pre-trained models on downstre...
Search-based structured prediction methods have shown promising successes in both computer vision an...
By utilizing the label dependencies among both the labeled and unlabeled data, semi-supervised learn...
Visual recognition systems are often limited to the object categories previously trained on and thus...
Visual attributes expose human-defined semantics to object recognition models, but existing work lar...
In multi-label learning, each training example is represented by a single instance (feature vector) ...
Scene recognition has been widely studied to understand visual information from the level of objects...
This paper addresses the problem of learning word im-age representations: given the cropped image of...
This paper addresses the problem of learning word im-age representations: given the cropped image of...